Preface

The purpose of this research is to compare the minimum distance
estimation  technique  with  the  best  linear  unbiased  estimation  technique 
to  determine  which  estimator  provides  more  accurate  estimates  of  the 
underlying  location  and  scale  parameter  values  for  a  given  Pareto 
distribution. Two forms of the Kolmogorov, Anderson-Darling, and
Cramer-von  Mises  minimum  distance  estimators  are  tested.  A  Monte  Carlo 
methodology  is  used  to  generate  the  Pareto  random  variates  and  the 
resulting  estimates.  A  mean  square  error  comparison  is  then  performed 
to  evaluate  which  estimator  provides  the  best  results.  Additionally, 
various  sample  sizes  and  shape  parameters  are  also  used  to  determine 
whether they have an influence on a given estimator’s performance.

Abstract

This investigation compared the minimum distance estimation
technique  with  the  best  linear  unbiased  estimation  technique  to 
determine  which  technique  provided  more  accurate  estimates  of  the 
location  and  scale  parameter  values  when  applied  to  the  three  parameter 
Pareto  distribution.  Six  distinct  minimum  distance  estimators  were 
developed.  Of  these  six,  two  were  based  on  the  Kolmogorov  distance,  two 
were  based  on  the  Anderson-Darling  distance,  and  two  were  based  on  the 
Cramer-von  Mises  distance.  For  a  given  sample  size  and  Pareto  shape 
parameter,  the  location  and  scale  parameters  were  estimated. 
Additionally,  varying  combinations  of  sample  sizes  (6,  9,  12,  15,  or  18) 
and  shape  parameters  (1.0,  2.0,  3.0,  or  4.0)  were  tested  to  investigate 
the effect of such changes.

A  Monte  Carlo  methodology  was  used  to  generate  the  1000  sample  sets 
of  Pareto  random  variates  for  each  sample  size  -  shape  parameter 
combination  with  location  and  scale  parameters  both  set  to  a  value  of  1. 
The best linear unbiased estimator and the six minimum distance
estimators  then  provided  parameter  estimates  based  on  the  sample  sets. 
Finally,  these  estimates  were  compared  using  the  mean  square  error  as 
the  evaluation  tool.  The  results  of  this  investigation  indicate  that 
the  best  linear  unbiased  estimation  technique  provided  more  accurate 
estimates  of  location  and  scale  for  the  three  parameter  Pareto 
distribution than did the minimum distance estimation techniques.

A COMPARISON OF ESTIMATION TECHNIQUES FOR
THE THREE PARAMETER PARETO DISTRIBUTION

I. Introduction

Parameter  estimation  is  an  important  underlying  technique  in 
statistical  analysis.  Although  the  statistician  can  perform  some 
analysis  intuitively,  estimation  requires  a  specific  method.  For 
example,  if  a  statistician  is  asked  to  analyze  some  sample  data,  he 
could  order  it  in  ascending  order  and  draw  a  histogram  reflecting  the 
occurrence frequency of values within certain intervals. Further, from
the histogram’s shape, he could guess the underlying population
distribution. However, he could not easily determine the parameters
(e.g., mean, standard deviation) of the population. At this point, the
statistician  needs  a  method  to  estimate  the  true  population  parameters 
from  the  sample  data.  The  method  is  called  the  estimator,  and  the 
approximations  based  on  the  sample  are  the  statistics  (i.e.  the 
estimates).  Mendenhall  defines  an  estimator  as  "a  rule  which 
specifically  states  how  one  may  calculate  the  estimate  based  upon 
information  contained  in  a  sample"  (23:13).  Using  these  rules,  the 
statistician  can  estimate  the  parameters  of  a  population  distribution 
based  on  sample  data  drawn  from  the  population.  These  estimates  then 
summarize  the  properties  of  the  population  for  the  investigator. 

One estimation technique, called the best linear unbiased
estimator (BLUE), relies on a linear combination of order statistics
(10:265). Order statistics are a set of variables arranged according
to their magnitudes. For instance, ordering a set of observed random
variables (e.g., fastest times in an automobile race) from smallest to
largest results in a set of order statistics (24:229). The best
linear unbiased estimator (T) can be used to estimate an unknown
population parameter (θ), where T depends only on the values of n
independent random variables. In addition, the estimator, T, must be
linear in the set of n random variables. The estimator must also
display the minimum variance among linear estimators and must be
unbiased (10:265-266). In simple terms, unbiased means that, on the
average, the value of the estimator equals the parameter being estimated
(33:197). Therefore, by combining a set of order statistics in a linear
fashion, one can produce estimators for the underlying population
parameters. If these estimators also possess the properties of minimum
variance and unbiasedness, then they are called best linear unbiased
estimators.

Another parameter estimation technique is minimum distance
estimation, introduced by Wolfowitz in the 1950s as a method which "in a
wide variety of cases, will furnish super consistent estimators even
when classical methods ... fail to give consistent estimators" (38:9). A
minimum distance estimator is consistent if, as the sample size
increases, the probability that the estimate approaches the true value
of the parameter also increases (33:199). The minimum distance
estimation technique is closely related in theory to the statistical
procedure called goodness of fit because a distance measure is the
evaluation criterion for both procedures. In goodness of fit, one tests
the sample data to identify its underlying unknown distribution. A
goodness of fit test is "a test designed to compare the sample obtained
with the type of sample one would expect from the hypothesized
distribution to see if the hypothesized distribution function 'fits' the
data in the sample" (8:189). Certain goodness of fit tests are based on
a distance measure between the sample and a hypothesized distribution
with known population parameters. Minimum distance estimation, however,
reverses the goodness of fit approach by assuming a probability
distribution type and then finding the parameter values that minimize
the distance measure. These values become the estimates of the
population parameters (18:34).

Even though the minimum distance estimation technique was developed
in 1953, researchers did not study the technique extensively until
recently. Parr and Schucany reported in 1979 that the method yields
"strongly consistent estimators with excellent robustness properties"
(27:5) when used to estimate the location parameter of symmetric
distributions (27:5). Robustness of an estimator is its ability to
serve as a good estimator even when the distribution assumptions are not
strictly followed (27:3). Additionally, several Air Force Institute of
Technology (AFIT) students, under the guidance of Dr. Albert H. Moore,
have completed thesis research projects by applying the minimum distance
estimation technique to specific distributions and comparing this
technique with other estimation methods. These former students include
Maj McNeese, working with the generalized exponential power
distribution; Capt Daniels, working with the generalized t distribution;
Capt Miller, working with the three parameter Weibull distribution; Capt
James, working with the three parameter gamma distribution; 2Lt
Bertrand, working with the four parameter beta distribution; and 2Lt
Keffer, working with the three parameter lognormal distribution.
Results have generally shown that minimum distance estimators provide
better estimates (i.e., estimates closer to the actual population
parameters) than the other techniques used (4:9).

The  literature  search  reveals  that  the  capabilities  of  the  minimum 
distance  estimation  technique  have  not  been  compared  with  those  of  the 
best  linear  unbiased  estimator  with  regard  to  the  Pareto  distribution,  a 
distribution  of  considerable  value.  The  Pareto  distribution  has  a 
variety  of  uses  in  the  commercial  sector.  Johnson  and  Kotz  identify 
several  Pareto  distribution  analysis  areas,  including  city  population 
distribution,  stock  price  fluctuation,  and  oil  field  location  (16:242). 
In  addition  to  commercial  users,  the  Air  Force  also  uses  the  Pareto 
distribution in a number of analysis areas: time to failure of equipment
components (9), maintenance service times (14), nuclear fallout
particles’ distribution (11), and error clusters in communications
circuits (3). In sum, the Pareto distribution proves to be a
distribution  worthy  of  further  investigation.  Use  of  the  minimum 
distance  estimation  technique  applied  to  the  Pareto  distribution  offers 
the  researcher  a  chance  to  expand  the  frontier  of  knowledge  in  this 
area. 

SPECIFIC  PROBLEM 

Researchers  have  not  explored  the  potential  of  the  minimum  distance 
estimation  technique  to  improve  upon  the  best  linear  unbiased  estimation 
technique  as  applied  to  the  Pareto  distribution.  A  comparison  of  the 
techniques in a controlled environment is needed to evaluate which
technique performs better under given circumstances. The controlled
environment  should  specify  the  sample  size  and  the  value  of  the 
parameters  of  the  underlying  Pareto  distribution  function  for  each 
comparison  attempt. 


RESEARCH  QUESTION 

For specified parameter values and sample sizes, which estimation
technique,  minimum  distance  or  best  linear  unbiased,  performs  better 
when  applied  to  the  Pareto  distribution? 

GENERAL  APPROACH 

Monte  Carlo  analysis  is  the  analytical  method  to  be  used  to  make 
the  estimation  technique  comparison.  Monte  Carlo  analysis  of  estimation 
methods  consists  of  three  steps.  First,  one  generates  random  variates 
from  a  specified  Pareto  distribution  (i.e.,  a  Pareto  distribution  with 
known  parameters).  Second,  the  two  estimation  techniques  are  used  to 
obtain  parameter  estimates  based  on  the  random  sample  data  from  the 
first  step.  Third,  the  resulting  estimates  are  compared  to  determine 
which  estimation  technique  provided  the  better  parameter  estimates 
(4:27). The mean square error technique can be used to perform this
evaluation (4:31).
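
The three steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure of this study. It assumes the Pareto form F(x) = 1 - (1 + (x - a)/b)^(-c) for x >= a, with location a, scale b, and shape c; the function names are hypothetical; and the two estimators compared are simple stand-ins for the location parameter (the raw sample minimum, and the minimum with its bias removed on the assumption that b and c are known), not the BLU or minimum distance estimators.

```python
import random

def pareto_variate(rng, a, b, c):
    """Inverse-transform draw from the assumed CDF 1 - (1 + (x - a)/b)**(-c)."""
    u = rng.random()                      # uniform on [0, 1)
    return a + b * ((1.0 - u) ** (-1.0 / c) - 1.0)

def mean_square_error(estimates, true_value):
    """Average squared deviation of the estimates from the true parameter."""
    return sum((t - true_value) ** 2 for t in estimates) / len(estimates)

def compare_location_estimators(n, c, a=1.0, b=1.0, replications=1000, seed=12345):
    rng = random.Random(seed)
    # Under the assumed CDF the sample minimum has mean a + b/(c*n - 1),
    # so subtracting b/(c*n - 1) removes its bias (requires c*n > 1).
    correction = b / (c * n - 1.0)
    raw, corrected = [], []
    for _ in range(replications):
        # Step 1: generate a sample from the specified distribution.
        sample = [pareto_variate(rng, a, b, c) for _ in range(n)]
        # Step 2: apply each estimation technique to the same sample.
        smallest = min(sample)
        raw.append(smallest)
        corrected.append(smallest - correction)
    # Step 3: compare the techniques by mean square error.
    return mean_square_error(raw, a), mean_square_error(corrected, a)

mse_raw, mse_corrected = compare_location_estimators(n=6, c=1.0)
```

Applying both estimators to the same generated samples, as here, keeps the comparison from being blurred by sample-to-sample differences.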

SEQUENCE  OF  PRESENTATION 

This  report  will  proceed  with  five  additional  chapters.  The  second 
chapter  will  discuss  the  estimation  techniques  used  in  this  study  while 
the third chapter will present the Pareto distribution. The fourth
chapter will describe the Monte Carlo analysis methodology used to make
the estimation technique comparisons. The fifth chapter will present
the  results  and  conclusions  of  the  study  while  the  sixth  chapter  will 
provide  a  short  summary  and  some  recommendations  for  future  study  in 
this  area. 


II. Estimation Techniques


This  chapter  will  first  provide  a  discussion  on  estimation  in 
general,  some  desirable  properties  of  estimators,  and  the  empirical 
distribution  as  an  estimator  of  the  true  distribution.  Following  this 
discussion,  the  two  estimation  techniques  to  be  compared  in  this  thesis 
will  be  presented.  First  the  best  linear  unbiased  technique  will  be 
discussed  along  with  its  inherent  properties.  Then  the  minimum  distance 
technique  will  be  presented  in  the  three  distance  measure  forms  to  be 
used  throughout  the  rest  of  this  study. 

ESTIMATION 

Estimation  is  part  of  a  larger  area  of  study  called  statistical 
inference.  The  statistician  makes  inferences  about  the  state  of  nature, 
or the "way things really are" (22:187), based on data gathered from
experiments done to discover something about the state of nature
(22:187). Lindgren then narrows his discussion of statistical problems
to  decision  problems,  eliminating  the  areas  of  experimental  design  and 
representative  data  gathering. 

Some  statistical  problems,  notably  in  business 
and  industry,  are  decision  problems,  in  which  the 
partial  information  about  the  state  of  nature  provided 
by  data  from  experimentation  is  used  as  the  basis  of 
making an immediate decision [22:188].

Lindgren then describes the general decision problem as consisting of "a
set or 'space' A of possible actions that might be taken, the individual
'points' of this space being the individual actions" (22:188). He
finally defines estimation problems as "those in which the action space
A is identical with the space of parameter values that index the family
of possible states of nature" (22:188). In this case, states of nature
could be described by the distribution function family members, each
member  being  defined  through  its  own  set  of  parameter  values. 

Pritsker  describes  the  concept  of  parameter  estimation  by 
presenting two supporting definitions. He first defines the
"population" as the set of data points consisting "of all possible
observations of a random variable" (31:46). He then defines a "sample"
as being "only part of these observations" (31:46). A method to
summarize a set of data is "to view the data as a sample which is then
used to estimate the parameters of the parent or underlying population"
(31:46). Runyon and Haber simply define a parameter as "a summary
numerical value calculated from a population" (33:4).

Liebelt  indicates  that  the  estimation  problem,  defined  earlier  by 
Lindgren,  is  difficult  to  solve.  In  fact,  because  there  can  be  many 
estimates  regarding  a  problem,  the  solution  is  not  unique.  Therefore, 
the statistician begins searching for the "best" estimate; but, since
the criterion for a "best" estimate is arbitrary, there cannot be an
optimal  estimate  to  solve  all  problems  (21:135-136).  "Each  problem  may 
require  a  different  set  of  optimal  criteria:  the  choice  is  always  left 
to  the  user  of  estimation  theory"  (21:136).  So,  the  search  always 
continues  for  a  better  estimator.  This  thesis  is  a  continuation  of  that 
search. 

Before  we  continue  by  listing  and  defining  some  of  the  agreed  upon 
properties  of  a  good  estimator,  we  must  clarify  the  difference  between 
an  estimator  and  an  estimate.  Mendenhall  explains  that  an  estimator  is 
"a  rule  which  specifically  states  how  one  may  calculate  the  estimate 
based upon information contained in a sample" (23:13). However, when
the estimator is used to produce a particular value based on specified
sample data, "the resulting predicted value is called an estimate"
(23:13). Wine draws an analogy to describe the difference. He
indicates the distinction between the two is the same as the difference
between a function, f(x), and the evaluated functional value, f(c).
"f(x) is a variable defined in some domain of x, and f(c) is a constant
corresponding to a specified value of x equal to constant c"
(37:170-171). Before a sample is drawn, we have an estimator. After
the sample is drawn, the estimator produces a particular value which is
an estimate (37:171).

ESTIMATOR  PROPERTIES 

The search for better estimators continues; but what are the
criteria for determining a good estimator? Certain properties of
estimators have been defined and seem to be reasonable guides for
choosing good estimators, although these criteria cannot be fully
"justified except on the basis of intuition" (21:136). This section
will discuss four of these desirable properties. If an estimator is to
be used in repeated samplings from the same population, then
unbiasedness is a desirable property; otherwise, a biased estimator
could possibly be found which provides better parameter estimates.
Additionally, a good estimator should be consistent, efficient, and
invariant. Each of these properties will now be described in more
detail.

Unbiased Estimators. The first property of a good estimator to be
used in repeated samplings from the same population is unbiasedness.
Freeman defines an unbiased estimator as follows:

We have a population described by the density function
$f(x;\theta)$, where f is known and the value of the parameter
is unknown. A random sample $x_1, x_2, \ldots, x_n$ is drawn from
this population. The statistic $t(x_1, x_2, \ldots, x_n)$ is an
unbiased estimator of the parameter $\theta$ if

$$E(t) = \theta \tag{2.1}$$

for all n and for any possible value of $\theta$ [10:229].

Wine points out that this definition "requires that the mean of the
sampling distribution of any statistic equals the parameter which the
statistic is supposed to estimate" (37:172). In other words, the
expected value of the statistic t equals the parameter being estimated,
where "the expected value of a random variable x with density function
f(v) is defined as

$$E(x) = \int_{-\infty}^{\infty} v \, f(v) \, dv \tag{2.2}$$

" (21:05). Freeman defines the term density function as "a function
$f(x_i)$ which is connected to probability statements on the random
variable x by

$$P(x = x_i) = f(x_i) \tag{2.3}$$

" (10:18). Looking at unbiasedness from a slightly different perspective,
Liebelt says that unbiasedness "is desirable, for it states that in the
absence of measurement error, and uncertainty in the estimation
procedure, the estimate becomes the true value" (21:137). Freeman adds
a final note concerning unbiased estimators. He indicates that for an
estimator to be truly unbiased, Eq (2.1) "is required to hold for all
sample sizes n" (10:223). There are cases when Eq (2.1) roughly holds
only for very large sample sizes. In these cases, the estimator is
merely "asymptotically unbiased" (10:229).

Unbiasedness  is  an  important  property  for  an  estimator  to  have  in 
repeated  samplings  from  the  same  population.  The  reason  for  this 
statement becomes apparent when one looks at what can happen if an
estimator is biased. "Any estimating process used repeatedly and which
on the average (mean) is not equal to the parameter leads to a sure
cumulation of error in one direction" (10:229). To avoid this
accumulation of error in one direction, the statistician seeks to find
and use unbiased estimators. However, in a single estimation situation,
unbiasedness may not be desirable. Instead, one could seek to minimize
the  mean  square  error  of  the  estimate  which  could  then  result  in  a 
better  estimate. 
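
As a small illustration of Eq (2.1), the expected value of an estimator can be computed exactly when the population is tiny. The Python sketch below is an assumption-laden example, not part of the cited texts: it uses a hypothetical two-point population {0, 1}, treats the population variance (0.25) as the parameter, and enumerates every equally likely sample of size n. The variance estimator with divisor n - 1 satisfies Eq (2.1) for each n, while the divisor-n version falls short by the factor (n - 1)/n and so is only asymptotically unbiased.

```python
from itertools import product

def expected_value(estimator, population, n):
    """Exact E(t): average the estimator over every equally likely sample of size n."""
    samples = list(product(population, repeat=n))
    return sum(estimator(s) for s in samples) / len(samples)

def variance_divisor_n(sample):
    """Sum of squared deviations divided by n (biased for finite n)."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)

def variance_divisor_n_minus_1(sample):
    """Sum of squared deviations divided by n - 1 (unbiased)."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / (len(sample) - 1)

population = (0, 1)      # hypothetical population; its variance is 0.25

for n in (2, 5, 10):
    biased = expected_value(variance_divisor_n, population, n)
    unbiased = expected_value(variance_divisor_n_minus_1, population, n)
    print(n, biased, unbiased)
```

The gap between the divisor-n expectation and 0.25 shrinks as n grows, which is the behavior described above as asymptotic unbiasedness.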

Consistent  Estimators.  The  second  property  of  a  good  estimator  is 
that  of  consistency.  As  the  sample  size  increases,  one  would  want  the 
risk associated with the estimator to decrease. "That is, the estimator
ought to be better when it is based on twenty observations than when it
is based on two observations" (25:172). This supposition portrays the
idea of consistency. "An estimator is consistent if for a large sample
there is a high probability that the estimator will be near the
parameter it is intended to estimate" (5:140).

A similar definition expressed by Wine uses the idea of
convergence to define a consistent estimator. An estimator, t, of the
parameter θ is consistent if, for any small numbers d and e, "there
exists an integer n' such that the probability that [|t - θ| < e]
is greater than [1 - d] for all n > n'" (37:171). This
definition introduces the idea of convergence by saying, "given any
small [e], we can find a sample size large enough so that, for
all larger sample sizes, the probability that [t] differs from the
true value θ [by] more than e is as small as we please" (37:171).
Therefore, the estimator, t, converges in probability to θ (37:171).
Consistency,  then,  implies  that  as  sample  sizes  increase,  the 
probability  also  increases  that  the  estimator  provides  estimates  which 
more  closely  approximate  the  true  value  of  the  parameter  being 
estimated. 
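
The convergence idea can be illustrated by simulation. The Python sketch below is an assumed example rather than anything taken from the cited texts: it uses the sample mean of Uniform(0, 1) draws as the estimator t of θ = 0.5, and the function name, tolerance e = 0.05, seed, and replication count are arbitrary choices. It estimates the probability that |t - θ| < e for several sample sizes.

```python
import random

def coverage_probability(n, e, theta=0.5, replications=2000, seed=2024):
    """Estimate P(|t - theta| < e), where t is the mean of n Uniform(0, 1) draws."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replications):
        t = sum(rng.random() for _ in range(n)) / n
        if abs(t - theta) < e:
            hits += 1
    return hits / replications

# Estimated probability of landing within e of theta, by sample size.
probabilities = {n: coverage_probability(n, e=0.05) for n in (5, 20, 80, 320)}
```

For a consistent estimator such as this one, the estimated probabilities should rise toward one as n increases.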

Efficient  Estimators.  The  third  desirable  property  of  a  good 
estimator  is  that  of  efficiency.  Efficiency  is  generally  used  as  a 
measure  to  compare  two  estimators.  The  efficiency  is  the  ratio  of  their 
mean  square  errors.  Mendenhall  and  Scheaffer  indicate  that  the  mean 
square  error  can  be  written  as  the  summation  of  the  variance  and  the 
square of the bias of an estimator (24:267).

Since  variance  is  a  measure  of  the  dispersion  of  the  distribution 
of  an  estimator  about  the  parameter  value,  the  statistician  seeks  an 
estimator  with  small  variance.  By  selecting  an  estimator  with  the 
smaller  variance,  he  ensures  that  his  estimates  will  converge  more 
rapidly to the true parameter value (32:155). Therefore, "one estimator
is said to be more efficient than another when the variability of its
sampling distribution is less" (33:198).
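
The decomposition of the mean square error into variance plus squared bias, and the efficiency ratio built from it, can be checked numerically. The Python sketch below is illustrative: the function names and the two lists of estimates are hypothetical, and placing estimator A in the numerator of the ratio is an assumption, since the text does not fix which estimator goes on top.

```python
def mse_components(estimates, theta):
    """Return (mean square error, variance, bias) of estimates about theta."""
    count = len(estimates)
    mean_estimate = sum(estimates) / count
    bias = mean_estimate - theta
    variance = sum((t - mean_estimate) ** 2 for t in estimates) / count
    mse = sum((t - theta) ** 2 for t in estimates) / count
    return mse, variance, bias

def relative_efficiency(estimates_a, estimates_b, theta):
    """Ratio MSE(A) / MSE(B); a value below 1 favors estimator A."""
    return mse_components(estimates_a, theta)[0] / mse_components(estimates_b, theta)[0]

# Hypothetical estimates of a parameter whose true value is 1.0.
estimates_a = [0.9, 1.1, 1.0, 1.2]
estimates_b = [0.5, 1.5, 1.0, 1.4]

mse, variance, bias = mse_components(estimates_a, 1.0)
ratio = relative_efficiency(estimates_a, estimates_b, 1.0)
```

Up to floating-point rounding, `mse` equals `variance + bias ** 2`, which is the identity attributed above to Mendenhall and Scheaffer.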

Invariant  Estimators.  The  final  property  of  a  good  estimator  is 
that  of  invariance.  Invariance  is  particularly  desirable  when 
functional  transformations  must  be  made  regarding  the  parameter.  As 
Freeman states:

We call a method of estimation invariant under
transformation of a parameter if, when the method
leads to t as the estimator of θ, the method also
leads to g(t) as the estimator of g(θ). We can
speak of t as an invariant estimator for a certain
class of transformations g if, when the parameter θ
is transformed by g to g(θ), the estimator t is
transformed to g(t) [10:233].

If  the  statistician  is  working  with  an  invariant  estimator  where 
the estimate of θ is t, then he can conclude that his estimate for
θ + k is t + k and his estimate for kθ is kt (10:233).

Thus,  the  property  of  invariance  permits  the  transformation  of  a 
parameter  to  be  translated  into  the  transformation  of  its  estimator. 

Summary.  Three  desirable  properties  of  an  estimator  are 
consistency,  efficiency,  and  invariance.  Unbiasedness  is  desirable  when 
the estimator is used in repeated sampling from the same population.
Unbiasedness means that, on the average, the estimator equals the
parameter being estimated. Consistency means that as the sample size
increases, the estimator will more closely approximate the true
parameter value. Efficiency is a comparative measure between estimators
where the estimator with the smaller mean square error is more
efficient. Finally, invariance means that if a transformation operation
is performed on a parameter, the identical transformation can be
performed  on  the  estimator  resulting  in  the  transformed  estimator 
becoming  a  valid  estimator  for  the  transformed  parameter.  Although 
these  properties  are  desirable,  estimators  generally  do  not  possess  all 
of  these  properties.  Therefore,  the  statisticians  must  find  an 
estimator with the properties needed for their particular applications.

EMPIRICAL DISTRIBUTION FUNCTION (EDF)

An  empirical  distribution  is  a  distribution  based  solely  on  sample 
values  of  a  random  variable.  The  empirical  distribution  can  be  thought 
of as an estimate of the true underlying population distribution. The
empirical distribution is developed "by observing several values of the
random variable and constructing a graph S(x) that may be used as an
estimate of the entire unknown distribution function F(x) of the random
variable" (8:69). Conover defines the empirical distribution as follows:

Let $X_1, X_2, \ldots, X_n$ be a random sample. The empirical
distribution function S(x) is a function of x, which
equals the fraction of $X_i$'s that are less than or equal
to x for each x, $-\infty < x < \infty$ [8:69].

Based on this definition, the graph of the empirical distribution
function, S(x), is a step function starting at zero. As each sample
value (ordered from lowest to highest) is encountered, a step of height
1/n is entered on the graph. This procedure continues until all the
sample values have been entered and a height of one has been reached.
"S(x) resembles a distribution function in that it is a nondecreasing
function that goes from zero to one in height. However, S(x) is
empirically (from a sample) determined and therefore its name" (8:70).
The empirical distribution function is used as an estimator for the
population distribution function of the random variable (8:70).

From the empirical distribution function, one can "compute the
expectation of the empiric random variable, E(x). We have

$$E(x) = \sum_{i=1}^{n} x_i (1/n) = (1/n) \sum_{i=1}^{n} x_i \tag{2.4}$$

which is just the sample mean, $\bar{x}$" (5:137). Eq (2.4) uses the
discrete random variable form of the expected value definition.
Therefore,  assuming  the  empirical  distribution  acceptably  estimates  the 
population  distribution  leads  to  the  sample  mean  being  an  acceptable 
estimate  for  the  population  mean  (5:138). 
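
Conover's definition and Eq (2.4) translate directly into code. The Python sketch below is a minimal illustration; the function names and the four observations are hypothetical.

```python
def empirical_distribution(sample):
    """Return S(x): the fraction of sample values less than or equal to x."""
    ordered = sorted(sample)
    n = len(ordered)

    def s(x):
        return sum(1 for value in ordered if value <= x) / n

    return s

def empirical_mean(sample):
    """Expectation under S(x): each observation carries probability 1/n, as in Eq (2.4)."""
    return sum(x * (1.0 / len(sample)) for x in sample)

sample = [3.0, 1.0, 2.0, 5.0]        # hypothetical observations
S = empirical_distribution(sample)

# S steps up by 1/n = 0.25 at each observed value.
steps = [S(x) for x in (0.5, 1.0, 2.5, 5.0)]
```

Here `steps` traces the staircase from zero to one, and `empirical_mean(sample)` returns the ordinary sample mean.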

BEST LINEAR UNBIASED ESTIMATOR (BLUE)

Knowing what properties are desirable in an estimator still
leaves the statistician with the problem of developing an estimator.
One estimator is called the best linear unbiased estimation technique.
As was mentioned in Chapter I, the BLU estimator is based on order
statistics, which are simply an arrangement of random variables in
order of magnitude (24:229). A population parameter (θ) can be
estimated by a statistic (T) which depends only on the values of n
independent random variables, $x_1, x_2, \ldots, x_n$ (10:255).

The  title  of  this  estimator  indicates  some  of  the  properties  that 
it possesses. Namely, the estimator must be unbiased, "best", and
linear. As was discussed earlier in this chapter, an unbiased
estimator has a bias term equal to zero, and, on the average over many
trials, the estimator provides estimates equal to the parameter value.
Eq (2.1) states the property mathematically. In addition to being
unbiased, the BLU estimator must be "best". To be best among unbiased
estimators, the estimator must have the minimum mean square error
(10:265). The mean square error is the sum of the variance term and
the square of the bias term (24:267). Since we are dealing with an
estimator which is inherently unbiased (i.e., the bias term equals
zero), the mean square error simply reduces to the variance term.
Therefore, in this case, best implies minimum variance. Finally, the
BLU estimator must be linear. Linearity demands that we consider only
"estimators which are linear in the random variables $x_1, \ldots, x_n$,
for it is only in comparison with other estimators within this
restricted class that we can always find estimators [which are best
unbiased]" (10:266). Stated mathematically, the estimator appears as
follows:

$$T = c_1 x_1 + \cdots + c_n x_n \tag{2.5}$$

where the coefficients ($c_i$) must be determined (10:266).

In  addition  to  the  properties  described  above,  the  best  linear 
unbiased  estimator  possesses  another  desirable  feature,  that  of 
invariance.  Mood  and  Graybill  indicate  that  BLU  estimators  are  a  subset 
of least-squares estimators (25:349). Further, they state that, in
general, least-squares estimators do not possess the invariance property.
"There is one important case, however, when the invariant property holds
for least-squares estimators, and this is the case of linear functions"
(25:350). Therefore, in addition to being unbiased, and possessing
minimum  variance,  the  BLU  estimator  is  also  invariant. 
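
One way to see how the coefficients of Eq (2.5) can be determined is the generalized least-squares formulation for location and scale parameters, in which the ordered sample is regressed on the means of the standardized order statistics, weighted by the inverse of their covariance matrix. The Python sketch below is an illustration under stated assumptions, not the computation used in this thesis: it substitutes the two parameter exponential distribution as a stand-in, because the means and covariances of its standardized order statistics have simple closed forms, and all function names are hypothetical. The Pareto coefficients are not reproduced here.

```python
def solve(matrix, rhs):
    """Solve matrix * y = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def exponential_order_moments(n):
    """Means and covariance matrix of standard exponential order statistics."""
    means, variances = [], []
    mean_sum = var_sum = 0.0
    for i in range(1, n + 1):
        mean_sum += 1.0 / (n - i + 1)
        var_sum += 1.0 / (n - i + 1) ** 2
        means.append(mean_sum)
        variances.append(var_sum)
    # Cov(X(i), X(j)) equals the variance of the smaller order statistic.
    cov = [[variances[min(i, j)] for j in range(n)] for i in range(n)]
    return means, cov

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def blue_location_scale(sample, means, cov):
    """Generalized least-squares estimates of (location, scale) from the ordered sample."""
    x = sorted(sample)
    ones = [1.0] * len(x)
    w_one = solve(cov, ones)       # cov^-1 * 1
    w_mean = solve(cov, means)     # cov^-1 * means
    m11, m12, m22 = dot(ones, w_one), dot(ones, w_mean), dot(means, w_mean)
    r1, r2 = dot(w_one, x), dot(w_mean, x)
    det = m11 * m22 - m12 * m12
    return (m22 * r1 - m12 * r2) / det, (m11 * r2 - m12 * r1) / det

sample = [2.0, 5.0, 3.0, 9.0]      # hypothetical observations
means, cov = exponential_order_moments(len(sample))
location, scale = blue_location_scale(sample, means, cov)
```

Both results are linear combinations of the ordered observations, which is the form of Eq (2.5). For this exponential stand-in they agree with the known closed forms, (n x_(1) - x̄)/(n - 1) for location and n(x̄ - x_(1))/(n - 1) for scale, where x_(1) is the smallest observation and x̄ the sample mean.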

MINIMUM  DISTANCE  (MD)  ESTIMATOR 

Chapter  I  presented  a  partial  history  and  description  of  the 
minimum  distance  estimation  technique.  The  efforts  of  Wolfowitz 
culminated in his 1957 paper which refined his work toward "developing
the minimum distance method for obtaining strongly consistent estimators
(i.e., estimators which converge with probability one)" (39:75). In the
paper, he emphasized that his method could be used with a variety of
distance measuring techniques (39:75). Additionally, Wolfowitz stated
that "it is a problem of great interest to decide which, if any,
definition of distance yields estimators preferable in some sense"
(39:76). This thesis will in part respond to this challenge, since
three distance measures will be used in the minimum distance method for
comparison against the best linear unbiased estimation method. The
three distance measures to be used are the Kolmogorov, the
Anderson-Darling, and the Cramer-von Mises discrepancy measures.
Wolfowitz finally summarizes the minimum distance method as follows:

The estimator is chosen to be such a function of the
observed chance variables that the d.f. of the observed
chance variables (when the estimator is put in place of
the parameters and distributions being estimated) is
"closest" to the empiric d.f. of the observed chance
variables [39:76].

Since 1957, the minimum distance estimation technique has been
studied by many other statisticians and has been found to display other
desirable estimator properties. The technique has "been considered as a
method for deriving robust estimators by Knusel (1969) and Parr and
Schucany (1980)" (28:178). Additionally, Parr and Schucany indicate
that the method yields "strongly consistent estimators with excellent
robustness properties" when used to estimate the location parameter of
symmetric distributions (27:5). They define robust estimation as
"efficient or nearly efficient (at a model) estimation procedures which
also perform well under moderate deviations from that model" (27:2).
They attempt to explain why the minimum distance estimator possesses
robustness properties:

It may well be inquired as to why an estimator obtained
by minimization of a discrepancy measure which is useful
for goodness-of-fit purposes (and, hence, in many cases
extremely sensitive to outliers or general discrepancies
from the model) should be hoped to possess any desirable
'robustness' properties. It turns out that, in most
cases . . . while the discrepancy measure itself may be
fairly sensitive to the presence of outliers, the value . . .
which minimizes the discrepancy . . . is much less so [27:5-6].

Finally, they state that the method presents a trade-off between
efficiency considerations and robustness considerations (28:179).

In addition to consistency, robustness and efficiency,
investigators have revealed other attractive features of the minimum
distance estimation technique. Parr and Schucany indicate that "minimum
distance estimators share an invariance property with maximum likelihood
estimators . . . It operates in a manner analogous to maximum likelihood
methods in simply selecting a 'best approximating distribution' from
those in the model" (27:9). Additionally, Parr states that the method
is very easy to implement. "Given a set of data, a parametric model, and
a distance measure between distribution functions, all that is needed is
an omnibus minimization routine to compute the estimator"
(26:1207-1208). Finally, minimum distance estimators provide meaningful
results even if the conjectured parametric model is incorrect.
MD-estimation still provides the best approximation in terms of
probability units with regard to the conjectured distribution (26:1208).
"This is a feature not enjoyed by other estimation methods such as the
maximum likelihood" (26:1208). Therefore, MD-estimation can be a very
useful tool for the statistician.

The minimum distance estimation technique uses a distance measure
and, for this reason, is closely linked with certain goodness-of-fit
tests. As explained by Stephens, goodness-of-fit statistics are
"based on a comparison of $F(x)$ with the empirical distribution
function $F_n(x)$" (35:730). In a goodness-of-fit test, one is
interested in fitting an empirical distribution function, described
earlier, with a fully specified (i.e., with known parameters)
distribution function. The test for whether the fit is 'good' is
normally a measure of distance between the two distribution curves. In
contrast, minimum distance estimation uses a parent distribution family
with certain unknown parameters. The estimates of the unknown
parameters are those parameter values which minimize the distance
measure between the empirical distribution and the parent distribution
being investigated. The three distance measures to be used in this
study are described next.
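The idea of choosing parameter values that minimize a distance can be illustrated with a small sketch. It is not the procedure used in this study: the coarse grid search stands in for the "omnibus minimization routine" mentioned by Parr, the Kolmogorov distance is used only because it is the simplest of the three measures, and the names `pareto_cdf`, `kolmogorov_distance`, `md_estimate`, the grids, the seed, and the inverse-cdf sampling are all assumptions made for illustration. The parent family is the three parameter Pareto cdf of Eq (3.8) with the shape c treated as known.

```python
import random


def pareto_cdf(x, a, b, c):
    """Three-parameter Pareto cdf, Eq (3.8); taken as zero at or below a."""
    if x <= a:
        return 0.0
    return 1.0 - (1.0 + (x - a) / b) ** (-c)


def kolmogorov_distance(sample, cdf):
    """Largest vertical gap between the empirical step function and cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        z = cdf(x)
        # Check the gap just after and just before the i-th step.
        d = max(d, i / n - z, z - (i - 1) / n)
    return d


def md_estimate(sample, c, a_grid, b_grid):
    """Return (distance, a, b) with the smallest distance on the grid."""
    best = None
    for a in a_grid:
        for b in b_grid:
            d = kolmogorov_distance(sample, lambda x: pareto_cdf(x, a, b, c))
            if best is None or d < best[0]:
                best = (d, a, b)
    return best


# Hypothetical data: 12 variates drawn by inverting Eq (3.8).
rng = random.Random(1)
true_a, true_b, c = 1.0, 1.0, 3.0
sample = [true_a + true_b * ((1.0 - rng.random()) ** (-1.0 / c) - 1.0)
          for _ in range(12)]

a_grid = [0.5 + 0.05 * k for k in range(21)]  # candidate locations
b_grid = [0.5 + 0.05 * k for k in range(31)]  # candidate scales
d_hat, a_hat, b_hat = md_estimate(sample, c, a_grid, b_grid)
```

A finer grid or a general-purpose optimizer would sharpen the estimates; the sketch only shows that the estimator is whatever pair (a, b) brings the model cdf closest to the empirical one.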

Kolmogorov Distance. The statistic suggested by Kolmogorov in
1933 is the largest absolute distance between the graphs of the
empirical distribution function, $S(x)$, and the hypothesized
distribution function, $F(x;\theta)$, measured in the vertical direction
(8:345). Symbolically, the Kolmogorov distance ($D$) is given by:

$$D = \sup_x \, |F(x;\theta) - S(x)| \tag{2.6}$$

which reads $D$ equals "the supremum, over all x, of the absolute
value of the difference $F(x;\theta) - S(x)$" (8:347). Stephens
provides a computational form for all of the distance measures to be
used in this study, where he lets $z_i = F(x_i)$, $i = 1, 2, \ldots, n$,
with the $x_i$ taken in ascending order. For the Kolmogorov distance,
the computational form is as follows:

$$D^+ = \max_{1 \le i \le n} \left[ (i/n) - z_i \right]$$

$$D^- = \max_{1 \le i \le n} \left[ z_i - (i-1)/n \right]$$

$$D = \max(D^+, D^-) \tag{2.7}$$

(35:731). These computational formulae provide the maximum distance
between the empirical distribution function, which is a step function,
and the conjectured distribution function, $F(x;\theta)$.
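The computational form above translates almost line for line into code. The following is a minimal sketch, assuming the caller has already evaluated the conjectured cdf at each observation; the function name `kolmogorov_statistic` and the example values are hypothetical.

```python
def kolmogorov_statistic(z):
    """Return (D+, D-, D) from z_i = F(x_i), sorted here into ascending order."""
    z = sorted(z)
    n = len(z)
    # D+: how far the empirical step function rises above the model cdf.
    d_plus = max(i / n - zi for i, zi in enumerate(z, start=1))
    # D-: how far the model cdf rises above the empirical step function.
    d_minus = max(zi - (i - 1) / n for i, zi in enumerate(z, start=1))
    return d_plus, d_minus, max(d_plus, d_minus)


# Three hypothetical cdf values: D+ is about 0.3 and D- is about 0.1.
d_plus, d_minus, d = kolmogorov_statistic([0.1, 0.4, 0.7])
```

Both one-sided maxima are needed because the empirical distribution function jumps at each observation, so the largest gap can occur on either side of a step.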

Cramer-von Mises Distance. The Cramer-von Mises statistic is
actually a member of the Cramer-von Mises family of distance measures
which is "based on the squared integral of the difference between the
EDF and the distribution tested:

$$W^2 = \int \{F_n(x) - F(x;\theta)\}^2 \, \psi(x) \, dx \tag{2.8}$$

The function . . . [$\psi(x)$] . . . gives a weighting to the squared
difference" (34:2). The Cramer-von Mises statistic is produced by
setting the weighting function equal to one, $\psi(x) = 1$ (34:2). The
computational form of the Cramer-von Mises statistic is given by
Stephens as follows:

$$W^2 = \sum_{i=1}^{n} \left[ z_i - (2i-1)/(2n) \right]^2 + 1/(12n) \tag{2.9}$$

(35:731). This formula uses the same symbology as the computational
form of the Kolmogorov distance measure.
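As a sketch of how the Cramer-von Mises computational form might be coded (the function name and the example inputs are hypothetical, and the z values are again assumed to be the conjectured cdf evaluated at the observations):

```python
def cramer_von_mises_statistic(z):
    """W^2 = sum over i of [z_i - (2i - 1)/(2n)]^2, plus 1/(12n)."""
    z = sorted(z)
    n = len(z)
    squared_gaps = sum((zi - (2 * i - 1) / (2 * n)) ** 2
                       for i, zi in enumerate(z, start=1))
    return squared_gaps + 1.0 / (12 * n)


# When every z_i sits exactly at (2i - 1)/(2n) the sum vanishes,
# leaving the smallest attainable value, 1/(12n).
n = 4
perfect = [(2 * i - 1) / (2 * n) for i in range(1, n + 1)]
w2_min = cramer_von_mises_statistic(perfect)

w2 = cramer_von_mises_statistic([0.1, 0.4, 0.7])
```

Unlike the Kolmogorov distance, every observation contributes to the statistic, so a single large gap does not dominate the result.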

Anderson-Darling Distance. The Anderson-Darling distance
measure is actually another member of the Cramer-von Mises family. In
this case, however, the weighting factor is $1/\{u(1-u)\}$ where
$0 < u < 1$ (27:4). "This weight function counteracts the fact that
the discrepancy [in Eq 2.9] between $F_n(x)$ and $F(x;\theta)$ is necessarily
becoming smaller in the tails, since both approach 0 and 1 at the
extremes" (34:2). Therefore, the Anderson-Darling weighting function
gives "greater importance to observations in the tail than do most of
the EDF statistics" (34:2). Stephens gives the computational form of
the Anderson-Darling statistic as follows:

$$A^2 = -\left\{ \sum_{i=1}^{n} (2i-1) \left[ \ln z_i + \ln(1 - z_{n+1-i}) \right] \right\} / n - n \tag{2.10}$$

(35:731). Again, this computational formula uses the same symbology
used for the other two distance measures' computational formulae.
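The Anderson-Darling computational form can be sketched in the same way. The function name is hypothetical, and the sketch assumes every z value lies strictly between 0 and 1, since the logarithms are undefined at the endpoints.

```python
import math


def anderson_darling_statistic(z):
    """A^2 = -{sum (2i - 1)[ln z_i + ln(1 - z_(n+1-i))]}/n - n."""
    z = sorted(z)
    n = len(z)
    total = 0.0
    for i in range(1, n + 1):
        # z[i - 1] is z_i and z[n - i] is z_(n+1-i) in zero-based indexing.
        total += (2 * i - 1) * (math.log(z[i - 1]) + math.log(1.0 - z[n - i]))
    return -total / n - n


a2_single = anderson_darling_statistic([0.5])          # equals 2 ln 2 - 1
a2_low = anderson_darling_statistic([0.1, 0.4, 0.7])
a2_high = anderson_darling_statistic([0.3, 0.6, 0.9])  # mirror image of a2_low
```

The two three-point examples are reflections of one another about 0.5 and give the same value, which illustrates that the weighting treats the lower and upper tails symmetrically.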


III. Pareto Distribution


This chapter will first relate the history of the Pareto
distribution. A summary of various socio-economic and military
applications will follow this historical perspective. Then a detailed
description of the Pareto function will be presented. Finally, this
chapter will describe the best linear unbiased and the minimum distance
estimation techniques as applied specifically to the Pareto function.


HISTORY 

In 1897 Vilfredo Pareto (1848-1923), an Italian-born Swiss
professor of economics, formulated an empirical law which bears his name
(16:233). Pareto's Law was based on his study of the distribution of
incomes in several European countries during the nineteenth century.
The mathematical results of the study were summarized as follows:

$$N = A x^{-c} \tag{3.1}$$

where N is the number of people having incomes equal to or greater than
income level x. A and c are parameters, where c is sometimes referred to
as Pareto's constant or the shape parameter (16:233). Pigou summarized
Pareto's findings in the following statement:

It is shown that, if x signify a given income and N
the number of persons with incomes exceeding x, and
if a curve be drawn, of which the ordinates are
logarithms of x and the abscissae logarithms of N,
this curve, for all the countries examined, is
approximately a straight line, and is, furthermore,
inclined to the vertical axis at an angle, which, in
no country, differs by more than three or four degrees
from 56°. This means (since tan 56° = 1.5) that, if
the number of incomes greater than x is equal to N, the
number greater than mx is equal to $N(1/m)^{1.5}$,
whatever the value of m may be. Thus the scheme of income
distribution is everywhere the same [29:647].

The Pareto premise, then, as deduced from his mathematical findings and
stated in economic rather than mathematical terms, is as follows:

Hence, what this thesis amounts to in effect is that,
on the one hand, anything that increases the national
dividend must, in general, increase also the absolute
share of the poor, and, on the other hand--and this is
the side of it that is relevant here--that it is impossible
for the absolute share of the poor to be increased by
any cause which does not at the same time increase the
national dividend as a whole . . . we cannot be confronted
with any proposal the adoption of which would both make
the dividend larger and the absolute share of the poor
smaller, or 'vice versa' [29:648].

Pareto felt, therefore, that his law was "universal and
inevitable--regardless of taxation and social and political conditions"
(16:233).

Since the statement of Pareto's Law, several renowned economists
have refuted the law's sweeping applicability (16:233). In particular,
Pigou identified defects in its statistical basis, arguing that the
differences in inclination of the plotted lines were significant.
Additionally, he argued that such a generalization from an empirical
study under certain conditions (certain avenues of income such as
inheritance and personal effort) cannot justifiably be extended to all
social conditions (29:649-655).

The general defence of "Pareto's Law" as a law of even
limited necessity rapidly crumbles. His statistics
warrant no inference as to the effect on distribution of
the introduction of any cause that is not already present
in approximately equivalent form in at least one of the
communities--and they are very limited in range--from which
these statistics are drawn. This consideration is really
fatal; and Pareto is driven, in effect, to abandon the
whole claim . . . [29:654-655].

Additionally, Champernowne identifies weaknesses in the Pareto Law.
He indicates that the use of the Pareto constant as a measure of income
distribution inequality between communities suffers from two problems.
Firstly, the measure only addresses income before taxation. Secondly,
the measure only applies to income distributions among the rich and
breaks down when applied to those with medium incomes (7:603).

Finally, Fisk discusses the value of the Pareto distribution
regarding its ability to describe distributions of income. He states
that the "Pareto curve fits income distributions at the extremities of
the income range but provides a poor fit over the whole income range"
(12:171).

Therefore,  Pareto's  Law  with  regard  to  income  distributions  is  no 
longer  highly  touted.  However,  other  disciplines  have  found  application 
of  the  Pareto  distribution  to  be  very  useful. 

APPLICATIONS 

Socio-economic Related Applications. Although the Pareto
distribution was formulated as a reflection of income distribution, it
has proven to be useful in many other areas of investigation. Johnson
and Kotz indicate the Pareto distribution can be useful in describing
many socio-economic or naturally occurring quantities. Examples include
the distributions of city population sizes, fluctuations in the stock
market, and the occurrence of natural resources. The Pareto is useful
in these areas because they often display statistical distributions
with very long right tails (16:242).

Koutrouvelis listed some additional areas where the Pareto
distribution had successfully been used. These areas include business
mortality rates, worker migration, property values and inheritance, and
service times in queues (19:7).

Johnson and Kotz additionally identified the area of personal
income investigation as an area where the Pareto distribution was
applicable (16:242). In 1982, Wong used the Pareto in his analysis of
income. He indicates that many individuals underreport their true
incomes to avoid a portion of their tax payments. Wong shows the
applicability of the Pareto in reflecting this underreporting phenomenon
(40:1).

Militarily Related Applications. In addition to socio-economic
interests, the Pareto distribution has proven useful in many areas of
interest to the military. These areas include fallout mass-size
distributions, interarrival time distributions, and failure time
distributions. This section will address each of these areas in turn.

E. C. Freiling conducted a study for the U.S. Naval Radiological
Defense Laboratory concerning a comparison of distribution types for
describing "the size distribution of particle mass in the fallout from
land-surface bursts" (11:1). In this study, he compared the lognormal
distribution with the Pareto. He determined that, given the
uncertainties involved in the problem, the differences in descriptive
ability of the two distributions were trivial. He indicated that the
lognormal "has the esthetic advantage of an observationally confirmed
theoretical basis in the case of airburst debris" (11:12). However, if
truncation is required, the Pareto distribution has "the practical
advantage of simplifying further calculations of particle surface
distribution" (11:12).

A Pareto description of interarrival times has played an important
part in two other studies, one involving interarrival times in general
and the second involving telephone circuit error clustering. Bell,
Ahmad, Park and Lui performed the general interarrival time study
supported by a grant from the Office of Naval Research. They indicate
that interarrival time distributions are usually thick-tailed as
compared to Gaussian or Poisson processes for like distributions. They
state that the Pareto can provide a variety of tail thicknesses
depending on the value of the shape parameter employed (2:1). In the
telephone circuit paper, Berger and Mandelbrot propose a new model to
describe error occurrence on telephone lines. They conclude that the
Pareto distribution can well be used to approximate the distribution of
inter-error intervals.

Finally, the Pareto distribution has proven useful in life testing
and replacement policy situations. Davis and Feldstein show the Pareto
as a competitor to the Weibull distribution with regard to time to
failure of a system since, "unlike the Weibull, it does not give rise to
infinite hazard at the origin nor hazard increasing without bound"
(9:305). Kaminsky and Nelson illustrate the use of the Pareto in
developing replacement policy. The Pareto can be used to predict
component replacement times based on an accumulation of early failure
data (17:145).

PARETO  FUNCTION 

The mathematical formulation of Pareto's Law on income distribution
is shown in Eq (3.1). This law corresponds to the following Pareto
probability function as given by Johnson and Kotz:

$$P(x) = \Pr[X \ge x] = (a/x)^c, \qquad a > 0,\ c > 0,\ x \ge a \tag{3.2}$$

In this equation P(x) gives the probability that income is equal to or
greater than x, while a corresponds to some minimum income (16:234).
The cumulative distribution function (cdf) of X resulting from Eq (3.2)
gives the following Pareto distribution:

$$F_X(x) = 1 - (a/x)^c, \qquad a > 0,\ c > 0,\ x \ge a \tag{3.3}$$

(16:234). In his investigation of the Pareto distribution, Mandelbrot
distinguishes between two forms of the Pareto Law: the Strong Law of
Pareto and the Weak or Asymptotic form of the Law of Pareto.
Mandelbrot's Strong Law of Pareto is of the form shown in Eq (3.3) and
is written as follows:

$$1 - F_X(x) = \begin{cases} (x/a)^{-c}, & x > a \\ 1, & x \le a \end{cases} \tag{3.4}$$

Mandelbrot's Weak or Asymptotic form of the Pareto Law is written as
follows:

$$1 - F_X(x) \sim (x/a)^{-c} \quad \text{as } x \to \infty \tag{3.5}$$

The Weak form implies that if the log of the left side of the relation
is graphed against log x "the resulting curve should be asymptotic to a
straight line with slope equal to [-c] as x approaches infinity"
(16:245).
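The straight-line behaviour can be checked directly by taking logarithms of the Strong Law form in Eq (3.4). The short derivation below is an illustrative step added here rather than part of the cited sources; any base of logarithm gives the same slope.

```latex
\begin{aligned}
1 - F_X(x) &= (x/a)^{-c}, \qquad x > a \\
\log\bigl[\,1 - F_X(x)\,\bigr] &= -c \log x + c \log a
\end{aligned}
```

Under the Strong Law the plot of log[1 - F(x)] against log x is therefore exactly linear with slope -c and intercept c log a, while under the Weak form of Eq (3.5) the same line is approached only as x grows large.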

Grouping Pareto Distributions by Kind. There are several versions
of the Pareto cumulative distribution function. Often, these versions
are grouped according to 'kind'. There are three labels used in this
type of grouping scheme: Pareto distributions of the first kind, of the
second kind, and of the third kind.

A distribution of the form shown in Eq (3.3) is referred to as a
Pareto distribution of the first kind (16:234). A Pareto distribution
of the second kind is written as follows:

$$F(x) = 1 - K/(x + C)^c \tag{3.6}$$

(16:234). This form differs from the Pareto distribution of the first
kind through the addition of another quantity, C, in the denominator of
the second term on the right hand side of the equation.

In addition to the two distribution kinds above, Pareto suggested a
third law, the distribution of which Mandelbrot calls a Pareto
distribution of the third kind. The mathematical form is as follows:

$$F(x) = 1 - [K_2 e^{-hx}/(x + C)^c] \tag{3.7}$$

(16:234). The Pareto distribution of the third kind degenerates to that
of the second kind when h = 0.

Grouping Pareto Distributions by Parameter Number. Perhaps a more
understandable method of grouping the various forms of the Pareto
distribution function is by grouping them according to the number of
parameters the form contains. However, before describing these
functions, three basic parameters will be defined.

Hastings and Peacock describe three types of parameters which
always have a physical or geometrical meaning. These three parameters
are those of location (a), scale (b) and shape (c). This study will use
this symbology when using these parameters. The location parameter, a,
is "the abscissa of a location point (usually the lower or mid point) of
the range of the variate" (15:20). The scale parameter, b, "determines
the scale of measurement of the fractile, x" (15:20). A fractile is a
general element within the range of the variate, X (15:5). Finally, the
shape parameter, c, "determines the shape (in a sense distinct from
location and scale) of the distribution function (and other functions)
within a family of shapes associated with a specified type of variate"
(15:20). Using the normal distribution as an example, the mean is the
location parameter because it specifies a kind of mid point for the
distribution. The standard deviation is the scale parameter because it
provides a fractile measurement device for the distribution. "The normal
distribution does not have a shape parameter" (15:20). With this
background on location, scale and shape parameters, we can now proceed
with the discussion on grouping Pareto distributions according to the
number of parameters contained in the distribution expression.

The most commonly used form of the Pareto distribution is the two
parameter form; however, there is a more general form which uses all
three basic parameters of location (a), scale (b), and shape (c). This
section will present this more general form and show how the simpler
forms are derived from it. The three parameter form of the Pareto
distribution is written as follows:

$$F(x) = 1 - [1 + (x-a)/b]^{-c}, \qquad x > a \tag{3.8}$$

where b > 0 and a > 0 (20:218). As stated earlier, the notation of
Hastings and Peacock is used in this equation and in those that follow.

The two parameter Pareto distribution is the most common form of
the distribution and is derived from Eq (3.8) by eliminating either the
location or the scale parameter from the equation. One way to obtain a
two parameter distribution function is to set the location parameter
equal to zero. For a = 0 we obtain a Pareto distribution of the second
kind as shown in Eq (3.6) where $K = b^c$ and $C = b$. This special case is
sometimes referred to as the Lomax distribution (20:218). Another
method of effectively eliminating one of the parameters is to set the
location parameter equal to the scale parameter. Setting a = b in Eq
(3.8) results in the usual formulation of the Pareto distribution and is
the Pareto distribution of the first kind as shown in Eq (3.3).

The simplest form of the Pareto distribution is the one parameter
version which can be obtained by setting both the location and the scale
parameter equal to one. Setting a = b = 1 in Eq (3.8), the following
distribution function results:

$$F(x) = 1 - x^{-c}, \qquad x > 1 \tag{3.9}$$

This one parameter form is regarded as the 'standard form' of the Pareto
distribution (16:240).
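The reductions described above can be confirmed numerically. The sketch below codes the three parameter cdf of Eq (3.8) alongside the first-kind and Lomax forms and checks that they agree at a sample point. All function names are hypothetical, and the inverse-cdf helper `pareto_quantile` is an added convenience (obtained by solving Eq (3.8) for x) rather than something taken from the cited sources.

```python
def pareto3_cdf(x, a, b, c):
    """Eq (3.8): F(x) = 1 - [1 + (x - a)/b]^(-c) for x > a, otherwise 0."""
    if x <= a:
        return 0.0
    return 1.0 - (1.0 + (x - a) / b) ** (-c)


def pareto_first_kind_cdf(x, a, c):
    """Eq (3.3): F(x) = 1 - (a/x)^c for x >= a, otherwise 0."""
    return 1.0 - (a / x) ** c if x >= a else 0.0


def lomax_cdf(x, b, c):
    """Second-kind form with K = b^c and C = b: F(x) = 1 - b^c/(x + b)^c."""
    return 1.0 - b ** c / (x + b) ** c if x > 0 else 0.0


def pareto_quantile(u, a, b, c):
    """Inverse of Eq (3.8) for 0 <= u < 1: x = a + b[(1 - u)^(-1/c) - 1]."""
    return a + b * ((1.0 - u) ** (-1.0 / c) - 1.0)


# Setting a = b recovers the first kind; setting a = 0 recovers the Lomax form.
first_kind_gap = pareto3_cdf(3.0, 2.0, 2.0, 1.5) - pareto_first_kind_cdf(3.0, 2.0, 1.5)
lomax_gap = pareto3_cdf(3.0, 0.0, 2.0, 1.5) - lomax_cdf(3.0, 2.0, 1.5)

# Standard form, a = b = 1: F(2) = 1 - 2^(-3) = 0.875 when c = 3.
standard_value = pareto3_cdf(2.0, 1.0, 1.0, 3.0)
```

Passing uniform random numbers through `pareto_quantile` is one common way to generate Pareto variates for a Monte Carlo study, although the generator actually used in this thesis is not described in this section.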

Since  most  of  the  many  versions  of  the  Pareto  distribution  can  be 
derived  from  the  more  general  three  parameter  model,  this  thesis 
investigates  the  three  parameter  distribution.  This  should  ensure  that 
results  of  this  study  can  be  used  in  a  wider  variety  of  applications 
where estimation is required.

PARAMETER  ESTIMATION 

This section describes the estimation methods used in this study as
applied specifically to the Pareto distribution. First the best linear
unbiased estimators are presented along with the procedure used to
transform these estimators into a computational form. Then the minimum
distance estimation formulas will be adapted to the Pareto distribution.

Best Linear Unbiased Estimator. As was mentioned by Kulldorff and
Vannman, the general, three parameter form of the Pareto cumulative
distribution function has received little attention from statisticians
working on the development of estimators (20:218). Hence, many
estimators have been developed for special cases of the two parameter
formulation while few estimators are available for study of the more
general distribution form.

Kulldorff and Vannman successfully derived BLU estimators for three
cases of the general Pareto distribution where the shape parameter is
always assumed to be greater than two. Specifically, these cases are:
scale parameter when the location and shape are known; location
parameter when the scale and shape are known; and location and scale
parameters when the shape is known (20:218-224). The estimators
developed for the third case are the estimators used in this study,
since only shape parameters will be explicitly specified for the Pareto
distribution being investigated. However, these estimators are useful
only when c > 2.

Vannman later presented the BLU estimators for the same three
cases shown above with the condition that the shape parameter is equal
to or less than two (36:704). Therefore, his estimators will be used
for the cases when c ≤ 2.

BLUEs for Shape Greater Than 2. As stated earlier,
Kulldorff and Vannman developed BLU estimators for both the location
and scale parameters with the shape parameter known and greater than
two. From Chapter II, we recall that the BLU estimator is based on
order statistics where the random variables are arranged in order of
magnitude from smallest to largest (24:229). Therefore, the elements
of the drawn sample are ordered from smallest to largest to provide the
order statistics where $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$. Here
$x_{(1)}$ is the smallest valued observation and $x_{(n)}$ is the largest
valued observation from the sample of size n. Since a BLU estimator
must take the form of a linear combination of the ordered random
variables, the BLU estimators for the location and scale parameters of
the Pareto distribution with specified shape must be a linear
combination of the ordered sample observations, where the coefficients
of these observations are to be determined. In developing their BLU
estimators, Kulldorff and Vannman derived this linear relationship and
determined the coefficients which are based on the sample size and the
specified shape parameter. The BLU estimators for location, a, and
scale, b, are written as follows:

$$\hat{a} = x_{(1)} - Y / [(nc-1)(nc-2) - ncD] \tag{3.10}$$

$$\hat{b} = Y(nc-1) / [(nc-1)(nc-2) - ncD] = (nc-1)[x_{(1)} - \hat{a}] \tag{3.11}$$

The authors note that in the special case when a = b, the BLU estimator
reduces to the following:

$$\hat{a} = [1 - 1/(nc)]\, x_{(1)} \tag{3.12}$$

Equations (3.10) and (3.11) both contain two quantities, Y and D,
which still need to be defined. Y is defined in terms of D and an
additional new term $B_i$, while D simply contains the new term $B_i$. $B_i$
is defined in terms of the sample size, n, and the specified shape
parameter, c. Therefore, by computing the $B_i$ terms, both D and Y can
be determined. With D and Y known, one can then calculate the BLU
estimators of location and scale. The expressions for Y, D and $B_i$ are
as follows:

$$Y = (c+1) \sum_{i=1}^{n-1} B_i x_{(i)} + (c-1) B_n x_{(n)} - D x_{(1)} \tag{3.13}$$

$$D = (c+1) \sum_{i=1}^{n-1} B_i + (c-1) B_n \tag{3.14}$$

$$B_i = \frac{\Gamma(n-i+1)\,\Gamma(n+1-2/c)}{\Gamma(n-i+1-2/c)\,\Gamma(n+1)}, \qquad i = 1, 2, \ldots, n \tag{3.15}$$

Equations (3.10) to (3.15) are the Kulldorff and Vannman equations
(20:219-225).

To obtain the BLU estimators, we must calculate all of the $B_i$
values for $i = 1, 2, \ldots, n$. Eq (3.15) shows that $B_i$ contains
four gamma functions which would require considerable computational
time; however, the expression can be simplified to reduce the
computational load.

Banks and Carson indicate that "the gamma function can be thought
of as a generalization of the factorial notion which applies to all
positive numbers, not just integers" (1:144). They show that for any
positive real number, p, the gamma function of p is as follows:

$$\Gamma(p) = (p-1)\,\Gamma(p-1) \tag{3.16}$$

Since $\Gamma(1) = 1$, we see that if p is an integer then Eq (3.16)
reduces to (1:144):

$$\Gamma(p) = (p-1)! \tag{3.17}$$


Eq (3.16) and Eq (3.17) will be used to simplify the $B_i$ term to a more
manageable computational form. The first gamma function in the
numerator and the last gamma function in the denominator will always be
gamma functions of integer values; therefore, Eq (3.17) can be used to
transform these terms to common factorial terms. Eq (3.16) will be
used on the remaining gamma function in the numerator to assist in the
reduction process. Simplification of the first two $B_i$ terms (30)
will reveal a pattern which will simplify the evaluation process:

$$
\begin{aligned}
B_1 &= \frac{\Gamma(n-1+1)\,\Gamma(n+1-2/c)}{\Gamma(n-1+1-2/c)\,\Gamma(n+1)} \\
    &= \frac{\Gamma(n)\,\Gamma(n+1-2/c)}{\Gamma(n-2/c)\,\Gamma(n+1)} \\
    &= \frac{(n-1)!\,(n-2/c)\,\Gamma(n-2/c)}{n\,(n-1)!\,\Gamma(n-2/c)} \\
    &= (n - 2/c)/n \\
    &= 1 - 2/(cn)
\end{aligned}
\tag{3.18}
$$

Solving for $B_2$ in a similar manner yields the following:

$$
\begin{aligned}
B_2 &= \frac{\Gamma(n-2+1)\,\Gamma(n+1-2/c)}{\Gamma(n-2+1-2/c)\,\Gamma(n+1)} \\
    &= \frac{\Gamma(n-1)\,\Gamma(n+1-2/c)}{\Gamma(n-1-2/c)\,\Gamma(n+1)} \\
    &= \frac{(n-2)!\,(n-2/c)\,\Gamma(n-2/c)}{n!\,\Gamma(n-1-2/c)} \\
    &= \frac{(n-2)!\,(n-2/c)\,(n-1-2/c)\,\Gamma(n-1-2/c)}{n\,(n-1)\,(n-2)!\,\Gamma(n-1-2/c)} \\
    &= \frac{(n-2/c)\,(n-1-2/c)}{n\,(n-1)} \\
    &= [1 - 2/(cn)]\,[1 - 2/(c(n-1))]
\end{aligned}
\tag{3.19}
$$

Equations (3.18) and (3.19) reveal the following pattern for the $B_n$:

\[ B_n = [1 - 2/(cn)]\,[1 - 2/(c(n-1))] \cdots [1 - 2/(c \cdot 1)] \tag{3.20} \]

The following notation will allow even further simplification of the B
value computations:

Let $t_1 = 2/(cn)$, $t_2 = 2/(c(n-1))$, ..., $t_n = 2/(c \cdot 1)$.

Let $u_1 = 1 - t_1$, $u_2 = 1 - t_2$, ..., $u_n = 1 - t_n$.

Then $B_1 = u_1$, $B_2 = u_1 u_2$, ..., $B_n = u_1 u_2 \cdots u_n$.

And in general, the computational form is as follows:

\[ B_i = \prod_{j=1}^{i} u_j \tag{3.21} \]

where $u_j = 1 - t_j$ and $t_j = 2/(c(n-j+1))$ for j = 1, 2, ..., i
(30). Equipped with these relations, we can now write the
following recursive relationship which will allow simpler calculations
as recommended by Vännman (36:705):

\[ B_i = [1 - 2/(c(n-i+1))]\,B_{i-1}, \quad i = 1, 2, \ldots, n \tag{3.22} \]

Here $B_0 = 1$, so that the i = 1 case reproduces $B_1 = u_1$. With these
relationships available, the programming of these calculations will be
much simpler.
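
The recursion can be sketched in a few lines of Python. This is only an illustration, not the thesis program (whose source is in Appendix B). The function names `b_values` and `b_gamma` are hypothetical, the starting value B_0 = 1 is assumed so that B_1 = 1 - 2/(cn), and the gamma-function ratio used for the cross-check is the form that the simplification of B_1 and B_2 starts from.

```python
import math


def b_values(n, c):
    """Return [B_1, ..., B_n] from B_i = [1 - 2/(c(n-i+1))] * B_{i-1}."""
    values = []
    b_prev = 1.0  # assumed starting value B_0 = 1
    for i in range(1, n + 1):
        b_prev *= 1.0 - 2.0 / (c * (n - i + 1))
        values.append(b_prev)
    return values


def b_gamma(n, c, i):
    """B_i as the gamma-function ratio that the recursion simplifies.

    Only valid while every gamma argument is positive,
    i.e. while n - i + 1 - 2/c > 0.
    """
    log_b = (math.lgamma(n - i + 1) + math.lgamma(n + 1 - 2.0 / c)
             - math.lgamma(n - i + 1 - 2.0 / c) - math.lgamma(n + 1))
    return math.exp(log_b)


if __name__ == "__main__":
    n, c = 6, 3.0
    for i, b in enumerate(b_values(n, c), start=1):
        print(i, round(b, 6), round(b_gamma(n, c, i), 6))
```

For c = 3 and n = 6 every gamma argument stays positive, so the two columns printed by the loop should agree to rounding error; the recursion needs only one multiplication per term and no gamma evaluations.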

BLUEs for Shape Equal to or Less Than 2. Vännman indicates
that the variance of the Pareto distribution does not exist when the
shape parameter, c, is equal to or less than 2; therefore, the above
formulas for BLU estimators of location and scale do not apply. Vännman
further states, however, that if only the first k order statistics are
used in the estimator, where $2 \le k < n + 1 - 2/c$, then the variance
of the estimator does exist, with the added condition that the shape
parameter satisfies the following relationship: $2/n < c \le 2$. The
most efficient estimator is obtained by basing the estimator on the
first k order statistics where k = n - [2/c]. In this equation the
bracketed fraction implies that only the integer portion of the fraction
is used in the calculation (36:705-707). The formulas for the location
and scale parameters based on the first k order statistics are as
follows (36:707):

\[ a_k = X_{(1)} - b_k/(nc-1) \tag{3.23} \]

and

\[
b_k = \frac{1}{U_k}\left\{ (c+1)\sum_{i=1}^{k-1} B_i X_{(i)}
      + [(n-k+1)c - 1]\,B_k X_{(k)}
      - \frac{nc-1}{nc}\,(nc - 2 - U_k)\,X_{(1)} \right\}
\tag{3.24}
\]

where

\[
U_k = \frac{(nc-2)(nc-c-2) - nc\,[(n-k)c - 2]\,B_k}{(nc-1)(c+2)}
\tag{3.25}
\]


Again, Equations (3.23) and (3.24) can only be used where k represents
the first k order statistics and where $k < n + 1 - 2/c$. To obtain
the most efficient BLU estimator, Vännman indicates that k should
additionally satisfy the following: k = n - [2/c]. Vännman further
states that in the case where 2/c is already an integer value, Eq (3.24)
simplifies to the following (36:707):

\[
b_k = \frac{(c+1)(c+2)(nc-1)}{(nc-2)(nc-c-2)}
      \left\{ \sum_{i=1}^{n-2/c} B_i X_{(i)} - \frac{nc-2}{c+2}\,X_{(1)} \right\}
\tag{3.26}
\]

Eq (3.26) can then be entered into Eq (3.23) to obtain the BLU estimator
for location. However, to use Eq (3.26), the simplified version of the
BLU estimator of the scale parameter, four conditions must exist:

1) The shape parameter, c, must be specified

2) $2/n < c \le 2$

3) 2/c is an integer

4) $2 \le k = n - 2/c$
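
The following Python sketch shows one way Eqs (3.23) through (3.26) might be coded; it is an illustration under stated assumptions rather than the thesis subroutine. The names `blue_first_k` and `scale_simplified` are hypothetical, the normalizing term of Eq (3.25) is written `u_k`, the B values are built from the recursion with B_0 = 1 assumed, and the inputs are assumed to be already sorted.

```python
import math


def b_values(n, c):
    """B_1..B_n from B_i = [1 - 2/(c(n-i+1))] * B_{i-1}, with B_0 = 1 assumed."""
    out, b = [], 1.0
    for i in range(1, n + 1):
        b *= 1.0 - 2.0 / (c * (n - i + 1))
        out.append(b)
    return out


def blue_first_k(x_sorted, c):
    """BLU estimates (a_k, b_k) from the first k = n - [2/c] order statistics."""
    n = len(x_sorted)
    if not (2.0 / n < c <= 2.0):
        raise ValueError("this form is intended for 2/n < c <= 2")
    k = n - math.floor(2.0 / c)          # [2/c] is the integer part of 2/c
    if k < 2:
        raise ValueError("need at least two usable order statistics")
    B = b_values(n, c)                    # B[i-1] holds B_i
    nc = n * c
    # Eq (3.25): normalizing term
    u_k = ((nc - 2) * (nc - c - 2)
           - nc * ((n - k) * c - 2) * B[k - 1]) / ((nc - 1) * (c + 2))
    # Eq (3.24): scale estimate, built from its three terms
    head = (c + 1) * sum(B[i - 1] * x_sorted[i - 1] for i in range(1, k))
    last = ((n - k + 1) * c - 1) * B[k - 1] * x_sorted[k - 1]
    first = ((nc - 1) / nc) * (nc - 2 - u_k) * x_sorted[0]
    b_hat = (head + last - first) / u_k
    # Eq (3.23): location estimate
    a_hat = x_sorted[0] - b_hat / (nc - 1)
    return a_hat, b_hat


def scale_simplified(x_sorted, c):
    """Eq (3.26): scale estimate for the case where 2/c is an integer."""
    n = len(x_sorted)
    k = n - round(2.0 / c)
    B = b_values(n, c)
    nc = n * c
    factor = (c + 1) * (c + 2) * (nc - 1) / ((nc - 2) * (nc - c - 2))
    total = sum(B[i - 1] * x_sorted[i - 1] for i in range(1, k + 1))
    return factor * (total - (nc - 2) / (c + 2) * x_sorted[0])
```

When 2/c is an integer, (n-k)c - 2 vanishes and the coefficient on X_(k) becomes c + 1, so the general and simplified scale estimates should coincide. Comparing the two functions on a made-up sorted sample with c = 1 or c = 2 is a quick consistency check.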

Finally, Vännman notes two simplified expressions for when
the shape parameter equals 1 or 2: if c = 1, then
$B_i = (1 - i/n)\,[1 - (i-1)/n]$, and if c = 2, then $B_i = 1 - i/n$
(36:705). The computer program verification and validation phase
of this research revealed that Vännman's simplified expression for $B_i$
when c = 1 was incorrect in the published reference. By setting
c = 1 and simplifying Eq (3.15), the error in Vännman's published
formula was found. To generate correct B values with c = 1,
Vännman's bracketed term [1 - (i-1)/n] must be changed to
[1 - i/(n-1)]. These simplified expressions will be valuable in the
computer programming phase of this study, since B values must be
calculated to determine the BLU estimates.
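
The correction can be checked in exact arithmetic. The sketch below is illustrative only; all function names are hypothetical. It compares the product form of B_i, with factors 1 - 2/(c(n-j+1)) for j = 1, ..., i, against the two closed forms for c = 1 and against the closed form for c = 2.

```python
from fractions import Fraction


def b_product(n, c, i):
    """B_i as the product of u_j = 1 - 2/(c(n-j+1)), j = 1..i (c an integer here)."""
    b = Fraction(1)
    for j in range(1, i + 1):
        b *= 1 - Fraction(2, c * (n - j + 1))
    return b


def b_c1_corrected(n, i):
    """c = 1 with the corrected bracket [1 - i/(n-1)]."""
    return (1 - Fraction(i, n)) * (1 - Fraction(i, n - 1))


def b_c1_published(n, i):
    """c = 1 with the bracket [1 - (i-1)/n] as quoted from the reference."""
    return (1 - Fraction(i, n)) * (1 - Fraction(i - 1, n))


def b_c2(n, i):
    """c = 2 closed form."""
    return 1 - Fraction(i, n)


if __name__ == "__main__":
    n = 6
    for i in range(1, n + 1):
        print(i, b_product(n, 1, i), b_c1_corrected(n, i), b_c1_published(n, i))
```

For c = 1 the product telescopes to (n-i)(n-i-1)/(n(n-1)), which is the corrected expression; the quoted form already disagrees at i = 1, where it gives (n-1)/n instead of (n-2)/n.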

Minimum Distance Estimator. The general computational forms of
the three distance measures used in this study are presented in Chapter
II and are reflected in Equations (2.7), (2.9) and (2.10). To apply
these measures using a Pareto distribution, we simply substitute our
hypothesized Pareto distribution function, $P_i$, for the $z_i$ value
currently shown in these equations, where the starting point for the
estimates of location and scale will be the BLU estimates. This
hypothesized Pareto cdf can be written as follows:

\[
P_i = F(x_{(i)}; \hat{a}, \hat{b}, c)
    = 1 - [1 + (x_{(i)} - \hat{a})/\hat{b}]^{-c}
\tag{3.27}
\]

The minimization routine, ZXMIN, from the International Mathematical
Statistics Library (IMSL) will then alter the values of location and
scale to obtain the minimum distance measure values. These altered
estimates for location and scale then become the minimum distance
estimates for that particular distance measure. The procedural details
are covered in more depth in the following chapter.
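
To make the substitution concrete, the sketch below evaluates three distance measures at the Pareto cdf values of Eq (3.27). Equations (2.7), (2.9) and (2.10) are given in Chapter II and are not reproduced here, so the code assumes the usual textbook computational forms of the Kolmogorov, Cramer-von Mises and Anderson-Darling statistics; the thesis forms may differ in detail. All function names are hypothetical.

```python
import math


def pareto_cdf(x, a, b, c):
    """Eq (3.27): F(x; a, b, c) = 1 - [1 + (x - a)/b]^(-c)."""
    return 1.0 - (1.0 + (x - a) / b) ** (-c)


def kolmogorov(z):
    """Assumed form: D = max_i max(i/n - z_i, z_i - (i-1)/n)."""
    n = len(z)
    return max(max(i / n - zi, zi - (i - 1) / n)
               for i, zi in enumerate(z, start=1))


def cramer_von_mises(z):
    """Assumed form: W^2 = 1/(12n) + sum_i (z_i - (2i-1)/(2n))^2."""
    n = len(z)
    return 1.0 / (12 * n) + sum((zi - (2 * i - 1) / (2 * n)) ** 2
                                for i, zi in enumerate(z, start=1))


def anderson_darling(z):
    """Assumed form: A^2 = -n - (1/n) sum_i (2i-1)[ln z_i + ln(1 - z_{n+1-i})]."""
    n = len(z)
    s = sum((2 * i - 1) * (math.log(z[i - 1]) + math.log(1.0 - z[n - i]))
            for i in range(1, n + 1))
    return -n - s / n


def distances(x_sorted, a, b, c):
    """Evaluate all three measures for an ordered sample and trial (a, b)."""
    z = [pareto_cdf(x, a, b, c) for x in x_sorted]
    return kolmogorov(z), cramer_von_mises(z), anderson_darling(z)
```

A minimizer then only needs `distances` (or one of its components) as an objective in a and b. The Anderson-Darling form takes logarithms, so it requires every z_i to lie strictly between 0 and 1, which in turn requires the trial location to stay below the smallest observation.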


IV. Monte Carlo Analysis


This chapter will describe the specific analysis tool used in this
study to compare the best linear unbiased and the minimum distance
estimation techniques. The tool is called Monte Carlo analysis.
Following a general discussion of the Monte Carlo method, the specific
application of the method in this study will be described. This
application description will present the three step process of Monte
Carlo analysis along with the detailed procedures involved within each
step.


MONTE  CARLO  METHOD 

The Monte Carlo method, or the method of statistical trials (6:1),
falls within the realm of experimental mathematics. Hammersley and
Handscomb indicate that the essential difference between theoretical and
experimental mathematicians "is that theoreticians deduce conclusions
from postulates, whereas experimentalists infer conclusions from
observations" (13:1). Monte Carlo analysis is a member of the
experimental mathematics branch since it deals with mathematical
experiments on random numbers (13:2). A further explanation of the
Monte Carlo method is provided by Schreider:

The Monte Carlo method (or the method of statistical
trials) consists of solving various problems of
computational mathematics by means of construction
of some random process for each such problem, with
the parameters of the process equal to the required
quantities of the problem. These quantities are then
determined approximately by means of observations of
the random process and the computation of its statistical
characteristics, which are approximately equal to the
required parameters [6:1].

This description of the Monte Carlo method reflects how well suited the
method is for this particular study, since the description mirrors the
process used to compare the two estimation techniques.

MONTE CARLO STEPS AND PROCEDURES

This study uses a three step Monte Carlo process to compare best
linear unbiased estimation with minimum distance estimation (using three
distinct distance measures) as applied to the Pareto distribution.
First, one generates random variates from a specified Pareto
distribution (i.e., a Pareto distribution with known parameters).
Second, the two estimation techniques are used to obtain parameter
estimates based on the random sample data from the first step. Third,
the resulting estimates are compared to determine which estimation
technique provided the better parameter estimates (4:27).

Step 1: Data Generation. Using the Monte Carlo technique, we
generate our own random data using the random number generator of the
VAX 11/785 (VMS) computer system located at the Air Force Institute of
Technology, Wright-Patterson Air Force Base, Ohio. A random number
generator generates random numbers uniformly distributed on [0,1]
(1:293). Parr stated that there were four items required to perform a
minimum distance estimation: a set of data, a parametric model, a
distance measure, and a minimization routine (26:1207-1208). The data
generation step supplies the first two items by generating the data
based on a specified parametric model, the Pareto distribution.

In the first step, the researcher generates the random sample data
needed to create the controlled environment, using different parameter
values for each data set. To evaluate the effect of sample size on the
estimators and ensure validity, sample sizes (n) of 6, 9, 12, 15, and 18
are used. Additionally, shape parameters (c) of 1.0, 2.0, 3.0, and 4.0
are used with the location parameter (a) set to 1 and the scale
parameter (b) set to 1 for each sample size, resulting in 20 total data
sets. The random sample data required for the study are random variates
from a specified Pareto distribution. Previous thesis students had used
distributions for which computer programs were already available to
generate random variates using subroutines from the International
Mathematical Statistics Library (IMSL) (4:27; 18:43). However, IMSL
does not contain a similar subroutine for the Pareto distribution.
Therefore, the random variate relationship was derived using the inverse
transform technique (1:294-295) on the general three parameter Pareto
distribution function shown in Eq (3.8) with location parameter of 1 and
scale parameter of 1. The derivation of the Pareto random variate
relationship begins by substituting a = 1 and b = 1 into Eq (3.8), which
yields the following:


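\[ F(x) = 1 - (1/x)^{c} \tag{4.1} \]

Letting R be a random number between 0 and 1 and letting X be the random
variate, we have:

\[ R = 1 - (1/X)^{c} \tag{4.2} \]

Solving for X gives $X = [1/(1-R)]^{1/c}$. Because 1 - R is itself
uniformly distributed between 0 and 1, R can stand in for 1 - R, which
yields the Pareto random variate relationship:

\[ X = (1/R)^{1/c} \tag{4.3} \]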
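
A minimal Python sketch of the inverse transform relationship in Eq (4.3) follows. It is not the thesis subroutine: the function names are hypothetical, and Python's `random.Random` stands in for the VAX random number generator.

```python
import random


def pareto_variate(r, c):
    """Eq (4.3): X = (1/R)^(1/c), for location = scale = 1."""
    return (1.0 / r) ** (1.0 / c)


def pareto_sample(n, c, rng):
    """Draw n Pareto variates with location = scale = 1 and shape c."""
    # rng.random() lies in [0, 1); using 1 - rng.random() keeps R in (0, 1]
    # so the division above can never hit zero.
    return [pareto_variate(1.0 - rng.random(), c) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(12345)       # arbitrary seed, for repeatability only
    print(pareto_sample(6, 2.0, rng))
```

As a hand check, R = 0.25 with c = 2 gives X = (1/0.25)^(1/2) = 2, and Eq (4.1) gives F(2) = 1 - (1/2)^2 = 0.75 = 1 - R, as expected when R stands in for 1 - R.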


For each of the 20 data sets, 1000 samples are generated where each
data set is characterized by a unique sample size (n = 6, 9, 12, 15, or
18) and shape parameter (c = 1.0, 2.0, 3.0, or 4.0) with location
parameter and scale parameter set equal to 1. Therefore, a total of
20000 random sample sets are generated, since 20 separate data sets are
required to reflect all the combinations of sample sizes and shape
parameters. Previous studies also used 1000 samples to evaluate the
estimation techniques (4:28; 18:43). A computer subroutine, PARVAR, was
written to generate the 20000 random sample sets from the three
parameter Pareto distribution. The IMSL subroutine VSRTA was used on
each sample set of size n to arrange the random variates from smallest
to largest. The output was then used by each of the estimation
technique subroutines.

Step 2: Estimate Computation. The second step of the Monte Carlo
process is to use both of the estimation techniques, best linear
unbiased and minimum distance estimation, to compute estimates based on
the random sample data sets. We first present the procedures used for
finding the best linear unbiased estimates. This presentation is
followed by the minimum distance estimation procedures.

Using each of the data sets along with the best linear unbiased
estimators for the location and scale parameters of the Pareto
distribution function for each data set, one obtains 1000 best linear
unbiased estimates for the parameters of each particular Pareto
distribution sampled. The computer subroutines written to perform this
task were titled BLCGT2 and BLCLE2. These subroutines were eventually
run against all 20 data groups.

The minimum distance estimation process develops six minimum
distance estimators using the 'BLUE' estimates of location and scale
for each sample of size n as the starting values for the hypothesized
distribution function, $F(x_{(i)}; \hat{a}, \hat{b}, c)$, which in our
computational notation is equal to $z_i$. The IMSL minimization
subroutine, ZXMIN, then minimizes the computational form of each
distance measure in turn. For instance, by varying the value of the
location parameter while holding the scale equal to the BLUE for scale,
ZXMIN finds the value of the location parameter which minimizes the
distance between the hypothesized distribution and the empirical
distribution function for each sample of size n. This new value for the
location parameter is the single parameter minimum distance estimate of
the location parameter. Alternatively, by holding the location parameter
equal to the BLUE for location, ZXMIN uses the same procedures to obtain
a single parameter minimum distance estimate for the scale parameter.
Finally, ZXMIN finds what we call a double parameter minimum distance
estimate by varying both the location and the scale parameters in the
same minimization calculation. The result of a double parameter minimum
distance estimate run is a simultaneous estimate of both location and
scale. The two single parameter minimization techniques (i.e., one for
location and one for scale) along with the double parameter minimization
technique are applied to each of the three distance measures, resulting
in 12 minimum distance estimates for each data set generated. The
computer subroutines written to perform these tasks are KSMD, KSAMD, and
KSBMD for the Kolmogorov distance measure. For the Cramer-von Mises
distance measure, the subroutines CVMD, CVAMD, and CVBMD were written.
Finally, for the Anderson-Darling distance measure, the subroutines
written are entitled ADMD, ADBMD, ADABMD. The source code for these
subroutines is located in Appendix B. Each of these subroutines is run
against all 20 data groups.
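
The single parameter procedure can be illustrated with a short Python sketch. Several things here are assumptions rather than the thesis method: ZXMIN is not reproduced, and a plain golden-section search is swapped in for it; the Cramer-von Mises distance is written in its usual textbook computational form; the search intervals are arbitrary; and all function names are hypothetical. The double parameter case is not shown.

```python
import math


def pareto_cdf(x, a, b, c):
    """Hypothesized cdf: 1 - [1 + (x - a)/b]^(-c)."""
    return 1.0 - (1.0 + (x - a) / b) ** (-c)


def cvm_distance(x_sorted, a, b, c):
    """Assumed Cramer-von Mises form: 1/(12n) + sum (z_i - (2i-1)/(2n))^2."""
    n = len(x_sorted)
    return 1.0 / (12 * n) + sum(
        (pareto_cdf(x, a, b, c) - (2 * i - 1) / (2 * n)) ** 2
        for i, x in enumerate(x_sorted, start=1))


def golden_section(f, lo, hi, tol=1e-10):
    """Minimize a unimodal function f on [lo, hi] (stand-in for ZXMIN)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    x1 = hi - inv_phi * (hi - lo)
    x2 = lo + inv_phi * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:                       # minimum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - inv_phi * (hi - lo)
            f1 = f(x1)
        else:                             # minimum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + inv_phi * (hi - lo)
            f2 = f(x2)
    return 0.5 * (lo + hi)


def md_location(x_sorted, b_fixed, c, width=5.0):
    """Vary location only; scale is held at b_fixed (e.g. its BLU estimate)."""
    hi = x_sorted[0]                      # keep a at or below the smallest value
    lo = hi - width * b_fixed             # assumed search width
    return golden_section(lambda a: cvm_distance(x_sorted, a, b_fixed, c), lo, hi)


def md_scale(x_sorted, a_fixed, c, lo=1e-3, hi=50.0):
    """Vary scale only; location is held at a_fixed (e.g. its BLU estimate)."""
    return golden_section(lambda b: cvm_distance(x_sorted, a_fixed, b, c), lo, hi)
```

Holding one parameter fixed turns each minimization into a one-dimensional search, which is why a bracketing method suffices in this sketch; a simultaneous search over both parameters would need a two-dimensional minimizer such as the quasi-Newton approach ZXMIN provides.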


Step 3: Estimate Comparison. The third and final step in the
Monte Carlo analysis is estimate comparison. In this step, the mean
square error (MSE) approach is used to evaluate which estimation
technique provides more accurate parameter estimates (4:31).

Many statisticians support the use of MSE as a good evaluation tool
for comparing estimators. Mendenhall and Scheaffer state that
MSE is the expected value of $(\hat{\theta} - \theta)^2$. They further
indicate that the mean square error can be written as the summation of
the variance and the square of the bias of an estimator (24:267). Since
we seek unbiased and relatively efficient estimators, small MSE values
should provide a good indication of estimators possessing these two
desirable properties and should therefore provide a good estimator
comparison tool. Mendenhall further describes a method for evaluating an
estimator which parallels the method used in this study:

Thus the goodness of a particular estimator could be
evaluated by testing it by repeatedly sampling a very
large number of times from a population where the
parameters were known and obtaining a distribution of
estimates about the true value of the parameter. This
distribution of estimates would be referred to as the
'distribution of the estimator' . . . Those estimators
possessing distributions that grouped most closely about
the parameter would be regarded as 'best' . . . Hence,
the relative 'goodness' of estimators may be evaluated by
comparing their biases and their variances [23:14-15].

Since the MSE is a function of both the variance and the bias, an MSE
comparison should reflect the goodness of the estimators considered, as
suggested by Mendenhall. However, Mood and Graybill warn that "except
in trivial cases, there is no estimator whose mean-squared error is a
minimum for all values of $\theta$" (25:167). That is, for a given
$\theta$ value, estimator A may produce the smallest MSE, while for
another value of the parameter, estimator B may provide the smallest
MSE. However, Mood and Graybill do concede that the MSE does provide a
useful guide for estimator comparisons. In fact, they do end up using
the MSE as their guide in searching for minimum risk estimators.
Finally, Liebelt provides two reasons why minimizing the average mean
square error is a credible criterion for evaluating good estimators:

First, if the mean square error is zero or near zero
then the dispersion of the estimate from the true value
is also zero or near zero. Secondly, the choice of
minimizing the average mean square error is an easy
mathematical procedure, whereas other choices often
lead to insurmountable analytical difficulty [21:137].

Therefore, many authors support the use of comparative mean square
errors as a valid technique for evaluating the relative worth of
estimators, where the estimator with the smallest MSE is considered the
'best' estimator for a given set of parameter values.

The term mean square error is very descriptive of the procedures
used during the evaluation. The 'error' from each of the 1000 samples of
size n is found by subtracting the estimated parameter from the true
population parameter. This error term is then squared, giving the
'square error.' Finally, the mean of the 1000 'square error' terms is
found by summing these terms and dividing by 1000, thereby producing a
'mean square error' (4:32). The estimator providing the smallest MSE,
therefore, is the best estimation technique to use. The formula for
calculating the MSE is as follows:

\[
MSE(\hat{\theta}) = \left[ \sum_{i=1}^{N} (\hat{\theta}_i - \theta)^2 \right] / N
\tag{4.4}
\]

"where $\theta$ is the true value of the parameter, $\hat{\theta}_i$ is
the ith estimate, and N is the number of times the estimation is
performed; in this analysis, N = 1000" (4:32). In this case, the
parameters being evaluated are the location and scale parameters. Of
course, the computer was used to perform the MSE calculations because of
the large number of calculations involved. The MSE calculations are
embedded in the main program, BLUMD, thereby eliminating the need to
store large numbers of variate and estimate values. The MSE calculations
result in seven estimation error indicators for the location parameter
and seven estimation error indicators for the scale parameter for every
specified Pareto distribution considered. The seven estimation error
indicators for each parameter correspond to the seven estimation
techniques used: the best linear unbiased estimator and the six minimum
distance estimators. The estimation technique which reflects the
smallest MSE is considered the best parameter estimation technique for
that specified Pareto distribution.
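
The MSE calculation of Eq (4.4) is short enough to sketch directly. The Python below is illustrative only, with hypothetical function names; it also computes the bias and the variance (dividing by N rather than N - 1) so that the decomposition of MSE into variance plus squared bias can be checked numerically.

```python
def mean_square_error(estimates, true_value):
    """Eq (4.4): mean of the squared differences from the true parameter."""
    n = len(estimates)
    return sum((est - true_value) ** 2 for est in estimates) / n


def bias_and_variance(estimates, true_value):
    """Return (bias, variance), with the variance taken about the sample mean."""
    n = len(estimates)
    mean = sum(estimates) / n
    variance = sum((est - mean) ** 2 for est in estimates) / n  # divide by N
    return mean - true_value, variance


if __name__ == "__main__":
    made_up = [0.9, 1.1, 1.2]            # invented estimates of a true value of 1
    bias, var = bias_and_variance(made_up, 1.0)
    print(mean_square_error(made_up, 1.0), var + bias ** 2)
```

For the three invented estimates above the squared errors are 0.01, 0.01 and 0.04, so the MSE is 0.02, and the variance plus the squared bias gives the same figure up to rounding. In the study itself the sums are accumulated inside the main program rather than stored, which a running total would reproduce.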

The subroutines described above in the three step Monte Carlo
analysis were merged into a computer program (BLUMD) which output the
MSE values for the estimation techniques being compared. The source
code is found in Appendix B. The logical steps, or pseudocode, for the
program are listed in Figure 1.

Each of the subroutines in BLUMD was validated and verified
individually by comparison with sample hand calculations. Additionally,
the subroutines were again validated and verified as they were added to
the parent program. It was this validation and verification procedure
which first indicated there were possible problems with Vännman's
published B value formula which supported the generation of the best
linear unbiased estimates for c = 1. Chapter III identifies the published
version of the B value formula and the correction required.


Steps in BLU vs MD Estimate Comparison


1. Generate a sample set of n random variates from a Pareto
distribution with location and scale equal to 1 and shape equal to c.

2. Order sample from smallest to largest.

3. Calculate BLU estimates for location and scale based on sample
size n.

4. Calculate the Kolmogorov minimum distance estimates of
location and scale based on the sample.

5. Calculate the Cramer-von Mises minimum distance estimates of
location and scale based on the sample.

6. Calculate the Anderson-Darling minimum distance estimates of
location and scale based on the sample.

7. Find the error from the true value of 1 for each estimate and
square this error. Save a running sum of the squared error terms
for each estimate.

8. Repeat steps 1-7 1000 times for a given n.

9. Divide all 14 squared error totals by 1000 to give the
MSE's.

10. Output the 14 MSE's for the given n and c values.

11. Repeat steps 1-10 using a different sample size, n, but the
same c, until all values of n have been used.

12. Repeat steps 1-11 using a new shape value, c, until all
values of c have been used.


Figure 1. Pseudocode for Program BLUMD
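
The loop structure in Figure 1 can be sketched in Python as below. This is a simplified illustration, not program BLUMD: the names `run_cell`, `run_study` and `blue_c2` are hypothetical, Python's generator replaces the VAX one, and the only estimator plugged in is the BLU estimator for the special case c = 2 (Eqs (3.26) and (3.23) with B_i = 1 - i/n and k = n - 1). The minimum distance estimators would be added to the `estimators` dictionary in the same way.

```python
import random


def pareto_sample(n, c, rng):
    """Steps 1-2: ordered sample from the Pareto with location = scale = 1."""
    return sorted((1.0 / (1.0 - rng.random())) ** (1.0 / c) for _ in range(n))


def blue_c2(x, c):
    """BLU estimates for c = 2 only: Eq (3.26) for scale, Eq (3.23) for location."""
    if c != 2:
        raise ValueError("this stand-in estimator is written for c = 2 only")
    n = len(x)
    nc = n * c
    factor = (c + 1) * (c + 2) * (nc - 1) / ((nc - 2) * (nc - c - 2))
    total = sum((1.0 - i / n) * x[i - 1] for i in range(1, n))  # i = 1..n-1
    b_hat = factor * (total - (nc - 2) / (c + 2) * x[0])
    a_hat = x[0] - b_hat / (nc - 1)
    return a_hat, b_hat


def run_cell(n, c, estimators, reps=1000, seed=0):
    """Steps 1-10 for one (n, c): return {name: (MSE_location, MSE_scale)}."""
    rng = random.Random(seed)
    totals = {name: [0.0, 0.0] for name in estimators}
    for _ in range(reps):                                # step 8
        x = pareto_sample(n, c, rng)
        for name, estimator in estimators.items():      # steps 3-6
            a_hat, b_hat = estimator(x, c)
            totals[name][0] += (1.0 - a_hat) ** 2        # step 7: true value is 1
            totals[name][1] += (1.0 - b_hat) ** 2
    return {name: (t[0] / reps, t[1] / reps) for name, t in totals.items()}


def run_study(sample_sizes, shapes, estimators, reps=1000):
    """Steps 11-12: loop over all sample sizes for each shape value."""
    return {(n, c): run_cell(n, c, estimators, reps)
            for c in shapes for n in sample_sizes}


if __name__ == "__main__":
    print(run_study([6, 9, 12, 15, 18], [2.0], {"BLUE": blue_c2}))
```

Keeping only running sums of squared errors, as in step 7, means no variates or estimates need to be stored between repetitions.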




V. Results, Analysis and Conclusions

Figure 1 in Chapter IV described the pseudocode of the computer
program used to generate each Pareto random variate sample set,
calculate the best linear unbiased and the six new minimum distance
estimates for each parameter based on each sample set, and finally
determine the mean square error for each estimate. This chapter
presents the results of the computer program runs along with an analysis
of these results. Appendix A contains the results of the study in table
format, where a separate table of Mean Square Error values is presented
for each unique shape parameter - sample size combination investigated.
Since there were 20 possible combinations of shapes and sample sizes,
Appendix A contains 20 separate tables. Finally, this chapter presents
the conclusions drawn from the analysis of these results.

RESULTS 

Appendix A contains the tables of mean square errors (MSEs) for
each estimation technique used in the research effort, given a
particular shape parameter and sample size. Since MSE is the evaluation
tool used to determine which estimation technique was best, these tables
were used to make the estimator comparisons. The estimator with the
smallest MSE value is considered the best estimator of those
investigated.


                              TABLE II

               Mean square error for c = 1 and n = 9

          LOCATION (a)                        SCALE (b)

  ESTIMATION                        ESTIMATION
  TECHNIQUE        MSE              TECHNIQUE        MSE

  BLUE         2.1462625E-02        --            0.4062695
  ADMD1        2.3038462E-02        --            0.5638790
  CVMD1        2.4678135E-02        --            0.5760345
  KSMD1        4.0461082E-02        --            0.5824831
  CVMD2        4.2301375E-02        --            0.6407889
  KSMD2        4.3836664E-02        --            0.6685479
  ADMD2        6.1653644E-02        --            0.8271174


Figure  2.  Sample  Table  of  Mean  Square  Errors 


Figure 2 shows a sample of the table format. Each table contains
two sections. The left section contains the MSEs based upon estimation
of the location parameter while the right section contains the MSEs
based upon estimation of the scale parameter. This format permits an
easy comparison of the BLUE MSE value for each parameter with the MSE
value of each of the minimum distance estimation techniques used. The
smallest MSE value in each table section then reflects the best
estimation technique to use for that parameter, under the stated shape
parameter and sample size conditions. To further simplify the reading
of the tables, the estimation techniques are ordered in each table from
smallest MSE value to largest. Therefore, the technique which generated
the smallest MSE result is listed first in each column and is also the
best estimation technique to use for estimating that parameter under the
specified conditions.

Each section of the table contains a list of the estimation
techniques that were applied to the 1000 sample sets of ordered Pareto
random variates. BLUE refers to the best linear unbiased estimation
technique. Each of the other techniques was compared against this
technique to determine which technique provided the better estimate.
ADMD1 refers to the Anderson-Darling minimum distance estimation
technique. Additionally, the 1 implies that only one parameter was
permitted to vary while the other parameter was held constant (equal to
the BLU estimate). For example, ADMD1 under the location parameter
section of the table implies that the location parameter was varied
while the scale parameter was held equal to the BLU estimate for that
sample size and shape parameter. CVMD1 and KSMD1 refer to the
Cramer-von Mises and the Kolmogorov minimum distance estimation
techniques respectively. Again, the 1 implies that only one parameter
was allowed to vary in finding the minimum distance value, while the
other parameter was held equal to the BLU estimate. ADMD2 again refers
to the use of the Anderson-Darling distance measure in the minimization
process. However, in this case, both the location and scale parameters
were permitted to vary simultaneously in determining the minimum
distance measure. The 2 in the notation indicates that two parameters
were allowed to vary during the minimization process. CVMD2 and KSMD2
refer to the use of the Cramer-von Mises and Kolmogorov distance
measures respectively. Again, the 2 in the notation implies that two
parameters (i.e., location and scale) were permitted to vary during the
minimization routine. Therefore, in this report, ADMD1, CVMD1 and KSMD1
are called single parameter minimum distance estimation techniques while
ADMD2, CVMD2 and KSMD2 are called double parameter minimum distance
estimation techniques.

ANALYSIS 

Regarding the location parameter, the BLU estimator provided the
smallest MSE values in all cases except the case where the shape
parameter equalled 1 (c = 1) and the sample size equalled 6 (n = 6). In
this case, the single parameter Anderson-Darling minimum distance
estimator (ADMD1) provided the smallest MSE. Based on this analysis,
the results showed that, overall, the best linear unbiased estimator
performed better than any of the minimum distance estimators evaluated.

The results of the research regarding estimation of the scale
parameter were even more pronounced. Regardless of the shape parameter
(c = 1, 2, 3 or 4) or sample size (n = 6, 9, 12, 15, or 18) used in this
study, the BLUE provided the smallest MSE in every case and is therefore
ranked as the best of the estimation techniques investigated. None of
the minimum distance estimation techniques provided better MSE values in
any instance. Therefore, investigators should feel comfortable using
the BLUE as an instrument of estimation when the underlying population
distribution is the Pareto.

Additionally, some observations were made regarding the minimum
distance estimation techniques that were applied in this study and how
they performed against each other. Performance of the minimum distance
estimators on both location and scale was addressed.

For the location parameter, the single parameter Anderson-Darling
minimum distance estimator (ADMD1) provided the smallest MSE values in
every case among the minimum distance estimators tested. ADMD1 was
therefore considered the best minimum distance estimator of location
among those investigated.

One concern this researcher had regarding the minimum distance
estimation technique was whether to let both the location and scale
parameters vary (double parameter estimator) to achieve the minimum
distance measure or to permit only one of the parameters to vary (single
parameter estimator) while holding the other as a constant, equal to the
BLUE for that parameter. For the location parameter, the results show
that the single parameter minimum distance estimator out-performed its
double parameter counterpart in every case except one. When c = 1 and
n = 6, KSMD2 provided a smaller MSE than did KSMD1. In all other cases,
however, the single parameter minimum distance estimator provided better
results. Therefore, the single parameter estimation technique performed
better than its double parameter counterpart when the Kolmogorov,
Cramer-von Mises, or Anderson-Darling distance measure was minimized for
location parameter estimation of the Pareto.

For the scale parameter, the inferences drawn required a bit more
scrutiny as there was no single best minimum distance estimator. There
was a shift in performance when a shape parameter c > 1 was specified.
For c = 1, CVMD1 was the best overall estimator, since it provided the
smallest MSE in four of the five cases investigated. The exception
occurred again in the case c = 1 and n = 6, where the KSMD1 estimator
gave the smallest MSE; however, CVMD1 did provide the next smallest MSE.
Therefore, CVMD1 was selected as the best minimum distance estimator for
scale when the shape was specified as c = 1. In the other 15 cases
investigated, KSMD2 provided the smallest MSE values in 12 instances.
The three exceptions were: c = 2 and n = 6 where KSMD1 was best, c = 2
and n = 15 where CVMD1 was best, and c = 3 and n = 15 where ADMD1 was
best. Overall, CVMD1 performed best for c = 1 and KSMD2 performed best
for c = 2, 3 or 4 among the minimum distance estimators for scale
investigated.

Regarding use of the single parameter versus the double parameter
minimum distance estimation technique for scale, no clear rule can be
stated, although there was a definite trend shown in the results. For
c = 1, the single parameter technique clearly dominated, since in all but
one case the single parameter estimator provided smaller MSE values
than the corresponding double parameter estimator. The only exception
was for c = 1 and n = 12, where KSMD2 performed better than KSMD1
(perhaps an indication of the improved performance this estimator would
show for larger c values). However, as the shape parameter value
increased from 1 to 4, the performance of the double parameter
techniques improved. In fact, for c = 4, the double parameter estimation
techniques performed better than their single parameter counterparts in
all but one case: for c = 4 and n = 18, ADMD1 outperformed ADMD2.
Therefore, for c = 1, the single parameter minimum distance estimators
performed better overall than their double parameter counterparts. For
c = 4, the reverse was true. For shape values of 2 and 3, the
performance was mixed, but the trend toward improved double parameter
performance with increasing shape value was still evident.
CONCLUSION

The observations made regarding the minimum distance estimators,
although lengthy, should not overshadow the primary conclusion drawn
from this research. The best linear unbiased estimators provided the
best estimates of both location and scale when compared with any of the
minimum distance estimators, based on the mean square error criterion.
VI. Summary and Recommendations

This  chapter  presents  a  summary  of  the  research  effort,  restating 
the  objective  of  the  study,  the  methodology  used,  and  the  major 
conclusions  drawn  from  the  experimental  results.  Further,  three 
recommendations  for  further  study  in  this  area  are  presented. 

SUMMARY 

The purpose of this research was to compare the minimum distance
estimation technique with the best linear unbiased estimation technique
to determine which estimator provided more accurate estimates of the
underlying location and scale parameter values for a given three
parameter Pareto distribution with specified shape parameter. The
Kolmogorov, Cramer-von Mises, and Anderson-Darling distance measures
were used to develop the minimum distance estimators. For each of these
distance measures, two minimum distance estimators were developed. The
first minimum distance estimator varied only a single parameter value to
achieve the distance measure minimization. This estimator was called
the single parameter minimum distance estimator. The second minimum
distance estimator allowed both the location and scale parameters to
vary while achieving the minimum distance measure. These estimators
were called the double parameter minimum distance estimators. These
minimum distance estimators were compared against the best linear
unbiased estimators which had been previously developed by Kulldorff and
Vannman for shape greater than 2, and by Vannman for shape equal to or
less than 2. Manual derivation of the B values formula for the special
case, c = 1, revealed an error in the published version of the formula,
as explained in Chapter III of this report.
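For shape values of 1 and 2, the best linear unbiased estimates can be sketched directly from the formulas documented in the comments of the BCLE2 and BLCLE2 subroutines of the program listing. The sketch below is an illustration under stated assumptions: the denominator of the constant Anc is read as the product (nc-2)(nc-c-2), the sample is assumed to be sorted in ascending order, and the function name is hypothetical. The case of shape greater than 2 is not covered.

```python
def blue_c_le_2(sample, c):
    """Return (a_hat, b_hat) for shape c = 1 or c = 2.

    sample must be sorted in ascending order.
    """
    n = len(sample)
    nc = n * c
    k = n - 2 // c  # number of order statistics used, k = n - [2/c]
    if c == 1:
        weights = [(1 - j / n) * (1 - j / (n - 1)) for j in range(1, k + 1)]
    else:
        weights = [1 - j / n for j in range(1, k + 1)]
    # Constants Anc and Bnc (grouping of the Anc denominator is assumed)
    anc = (c + 1) * (c + 2) * (nc - 1) / ((nc - 2) * (nc - c - 2))
    bnc = (nc - 2) / (c + 2)
    bx_sum = sum(w * x for w, x in zip(weights, sample))
    b_hat = anc * (bx_sum - bnc * sample[0])   # BLUE for scale
    a_hat = sample[0] - b_hat / (nc - 1)       # BLUE for location
    return a_hat, b_hat
```

Because the estimators are linear in the order statistics, feeding in the expected order statistics of a Pareto sample with location and scale equal to one returns estimates of one for both parameters, which is a quick check on the weights.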

A Monte Carlo methodology was used to generate the estimates for
each of the estimation techniques investigated. A sample of Pareto
random variates was generated from a completely specified three
parameter Pareto distribution with location and scale equal to one and
the shape parameter iteratively specified as one, two, three, or four.
The estimates of location and scale were then generated based on each of
the estimation techniques. This process was repeated 1000 times for
each combination of shape parameter (c = 1, 2, 3, or 4) and Pareto
random variate sample size (n = 6, 9, 12, 15, or 18). This Monte Carlo
process resulted in 1000 estimates of both location and scale for each
estimation technique used.
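The variate generation step can be sketched as below. This is an illustration only: it assumes the CDF F(x) = 1 - [1 + (x - a)/b]^(-c), which for location and scale equal to one reduces to F(x) = 1 - x^(-c) and gives the inverse transform x = (1/r)^(1/c) documented in the PARVAR subroutine. Python's standard generator stands in for the IMSL routine, and the seed value and function name are hypothetical.

```python
import random


def pareto_sample(n, c, rng):
    """Return n Pareto variates (location = scale = 1), sorted ascending."""
    # 1 - rng.random() lies in (0, 1], so the division is always defined.
    draws = [(1.0 / (1.0 - rng.random())) ** (1.0 / c) for _ in range(n)]
    return sorted(draws)


rng = random.Random(12345)  # hypothetical seed, not the one used in the study
samples = {
    (c, n): [pareto_sample(n, c, rng) for _ in range(1000)]
    for c in (1, 2, 3, 4)
    for n in (6, 9, 12, 15, 18)
}
```

Each of the 20 shape and sample size combinations then holds 1000 sorted samples, ready to be passed to each estimator.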

The criterion for determining which estimation technique performed
best was the mean square error calculated for each group of 1000
estimates. The estimation technique which yielded the smallest mean
square error was selected as the best performing estimator.
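The selection rule amounts to the following short computation. It is a sketch, with hypothetical function names and technique labels, that takes the true parameter value to be one as in this study.

```python
def mean_square_error(estimates, true_value=1.0):
    """Average squared deviation of the estimates from the true value."""
    if not estimates:
        raise ValueError("no valid estimates")
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def best_technique(estimates_by_technique, true_value=1.0):
    """Return the name with the smallest MSE, plus all MSE values."""
    mse = {
        name: mean_square_error(values, true_value)
        for name, values in estimates_by_technique.items()
    }
    return min(mse, key=mse.get), mse
```

For example, `best_technique({"A": [0.9, 1.1], "B": [0.5, 1.5]})` selects technique "A", whose estimates lie closer to the true value.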

The results of this research clearly indicated that the best linear
unbiased estimator provided smaller mean square error terms than any of
the minimum distance estimation techniques investigated. Therefore, the
best linear unbiased estimation technique was ranked as the best
estimation technique among those tested.

Regarding the minimum distance estimators, a comparison of the
single versus double parameter techniques was made. For estimation of
the location parameter, the single parameter estimation technique
performed better than the double parameter estimation technique. For
the scale parameter estimation, the conclusion was not as clear. A
trend was identified as the value of the specified shape parameter
increased from 1 to 4. For c = 1, the single parameter estimators
performed better; however, as the shape parameter increased, the
performance of the double parameter estimators improved until, at c = 4,
the double parameter estimators performed better than their single
parameter counterparts.

RECOMMENDATIONS 

Three areas are recommended for further study. First, a study similar
to this one can be performed, again based upon a specified three
parameter Pareto distribution, but using minimum distance estimators
based on different distance measures. Examples of such distance measures
include the Kuiper distance and the Watson distance referenced by
M. A. Stephens (35:731). Second, a study involving a comparison of a set
of minimum distance estimators against the best linear unbiased
estimators based on the more commonly used two parameter form of the
Pareto distribution could prove fruitful. Third, a researcher could
perform a comparison study involving the maximum likelihood estimator
and a set of minimum distance estimators, again based upon the common
two parameter form of the Pareto distribution function. Any of these
areas would provide fertile ground for the investigative statistical
researcher.


Appendix A

Tables of Mean Square Errors

The following notation is used in this appendix:

Term                                            Notation

Best Linear Unbiased Estimator                  BLUE

Anderson-Darling Minimum Distance Estimator     ADMD1
(Only one varying parameter)

Cramer-von Mises Minimum Distance Estimator     CVMD1
(Only one varying parameter)

Kolmogorov Minimum Distance Estimator           KSMD1
(Only one varying parameter)

Anderson-Darling Minimum Distance Estimator     ADMD2
(Two varying parameters)

Cramer-von Mises Minimum Distance Estimator     CVMD2
(Two varying parameters)

Kolmogorov Minimum Distance Estimator           KSMD2
(Two varying parameters)

Location Parameter                              a

Scale Parameter                                 b

Shape Parameter                                 c

Sample Size                                     n

Mean Square Error                               MSE

The Monte Carlo analysis involves 1000 iterations for the
generation of each table. The true value of the location parameter is
one and the true value of the scale parameter is one for all of the
tables.
TABLE I

Mean square error for c = 1 and n = 6

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

ADMD1         6.1960317E-02       BLUE          0.9352127
CVMD1         6.4673184E-02       KSMD1         1.460597
BLUE          5.7886792E-02       CVMD1         1.523946
CVMD2         9.8166950E-02       ADMD1         1.589061
KSMD2         1.192953E-01        KSMD2         1.657526
KSMD1         1.2137F4E-01        CVMD2         1.667902
ADMD2         4.874095E-01        ADMD2         3.635ii43

TABLE III

Mean square error for c = 1 and n = 12

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          1.0628480E-02       BLUE          0.3627600
ADMD1         1.1631799E-02       CVMD1         0.4933714
CVMD1         1.3175875E-02       ADMD1         0.5042603
CVMD2         2.2455685E-02       KSMD2         0.5136845
KSMD1         2.3550959E-02       KSMD1         0.5234504
KSMD2         2.4744846E-02       CVMD2         0.5422385
ADMD2         3.2977652E-02       ADMD2         0.6380025

TABLE IV

Mean square error for c = 1 and n = 15

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          7.4139438E-03       BLUE          0.3053709
ADMD1         9.5694885E-03       CVMD1         0.3754165
CVMD1         1.1153221E-02       ADMD1         0.3911322
KSMD1         1.6373332E-02       KSMD1         0.4156374
              2.4338266E-02                     0.5204881
              2.5020871E-02                     0.5390435
              2.9120248E-02       ADMD2         0.6280289

TABLE V

Mean square error for c = 1 and n = 18

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          4.3005478E-03       BLUE          0.1737252
ADMD1         5.2362322E-03       CVMD1         0.2037124
CVMD1         6.4189113E-03       KSMD1         0.2041911
ADMD2         8.6196102E-03       ADMD1         0.2043972
KSMD1         1.0911038E-02       KSMD2         0.2185375
CVMD2         1.1255259E-02       CVMD2         0.2299257
KSMD2         1.2906388E-02       ADMD2         0.2362998

TABLE VI

Mean square error for c = 2 and n = 6

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          1.1706045E-02       BLUE          0.5737305
ADMD1         1.2802768E-02       KSMD1         0.7080300
CVMD1         1.3687569E-02       CVMD1         0.7609550
KSMD1         1.5005208E-02       CVMD2         0.7871752
CVMD2         1.7071828E-02       ADMD1         0.8051341
KSMD2         2.2394231E-02       KSMD2         0.9329775
ADMD2         2.8150991E-02       ADMD2         1.151503


TABLE VII

Mean square error for c = 2 and n = 9

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          4.6055736E-03       BLUE          0.2564761
ADMD1         5.1687085E-03       KSMD2         0.2989004
CVMD1         5.9630712E-03       CVMD1         0.3137060
CVMD2         6.9629652E-03       KSMD1         0.3169191
KSMD1         6.9843503E-03       CVMD2         0.3197137
KSMD2         7.8367852E-03       ADMD1         0.3233747
ADMD2         8.9378590E-03       ADMD2         0.4040874

TABLE VIII

Mean square error for c = 2 and n = 12

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          1.9467037E-03       BLUE          0.1996142
ADMD1         2.2874754E-03       KSMD2         0.2196842
CVMD1         2.9733009E-03       ADMD1         0.2351739
ADMD2         3.0276354E-03       CVMD1         0.2386533
KSMD1         3.49144eiE-03       KSMD1         0.2393236
CVMD2         3.7764474E-03       CVMD2         0.2409351
KSMD2         4.6899226E-03       ADMD2         0.2438605


TABLE IX

ESTIMATION                        ESTIMATION
TECHNIQUE                         TECHNIQUE

0.1682298
0.1976805
0.1981037
0.2068403
0.2084063
0.2095122
0.2135706


TABLE X

Mean square error for c = 2 and n = 18

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          7.7167596E-04       BLUE          0.1260424
ADMD1         9.3843893E-04       KSMD2         0.1401054
ADMD2         1.3170647E-03       ADMD1         0.1430691
CVMD1         1.3989438E-03       CVMD1         0.1444666
KSMD1         1.6159606E-03       CVMD2         0.1470691
CVMD2         1.8582337E-03       KSMD1         0.1481810
KSMD2         2.3905281E-03       ADMD2         0.1504120

TABLE XI

Mean square error for c = 3 and n = 6

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          4.6162526E-03       BLUE          0.3430938
ADMD1         4.7417325E-03       KSMD2         0.3799514
CVMD1         4.9956846E-03       KSMD1         0.4261144
CVMD2         5.5936114E-03       CVMD2         0.4292807
KSMD1         5.6171883E-03       CVMD1         0.4521263
KSMD2         6.1545176E-03       ADMD1         0.4634998
ADMD2         1.7377743E-02       ADMD2         0.7602041

TABLE XII

Mean square error for c = 3 and n = 9

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          2.5504350E-03       BLUE          0.2173062
ADMD1         2.9328801E-03       KSMD2         0.2343150
CVMD1         3.2298244E-03       CVMD2         0.2613097
KSMD1         3.4083289E-03       CVMD1         0.2633551
CVMD2         3.4843122E-03       KSMD1         0.2647687
ADMD2         3.6204350E-03       ADMD1         0.2666046
KSMD2         3.8682341E-03       ADMD2         0.2911893


TABLE XIII

Mean square error for c = 3 and n = 12

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          1.1032750E-03       BLUE          0.1465466
ADMD1         1.3244717E-03       KSMD2         0.1634654
ADMD2         1.5484667E-03       CVMD2         0.1717137
CVMD1         1.6325671E-03       ADMD1         0.1786888
CVMD2         1.8109155E-03       CVMD1         0.1787402
KSMD1         1.8245766E-03       ADMD2         0.1813732
KSMD2         2.1554409E-03       KSMD1         0.1851933

TABLE XIV

Mean square error for c = 3 and n = 15

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          6.1996793E-04       BLUE          0.1280874
ADMD1         8.2581321E-04       ADMD1         0.1454654
ADMD2         1.0505152E-03       CVMD1         0.1467468
CVMD1         1.0802834E-03       KSMD2         0.1484211
KSMD1         1.2737850E-03       KSMD1         0.1519526
CVMD2         1.2966762E-03       CVMD2         0.1521726
KSMD2         1.5728395E-03       ADMD2         0.1580111

TABLE XV

Mean square error for c = 3 and n = 18

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          3.5388739E-04       BLUE          0.1050092
ADMD1         4.7275926E-04       KSMD2         0.1183296
ADMD2         5.8612920E-04       ADMD1         0.1190109
CVMD1         6.7394995E-04       CVMD1         0.1217095
KSMD1         7.9139235E-04       CVMD2         0.1237090
CVMD2         8.1001956E-04       KSMD1         0.1239811
KSMD2         1.0424773E-03       ADMD2         0.1250392

TABLE XVI

Mean square error for c = 4 and n = 6

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          2.2004284E-03       BLUE          0.2905423
ADMD1         2.3699524E-03       KSMD2         0.3169968
CVMD1         2.5671816E-03       CVMD2         0.3370034
CVMD2         2.6604387E-03       KSMD1         0.3809848
KSMD1         2.7944313E-03       CVMD1         0.3836123
ADMD2         3.0173361E-03       ADMD2         0.3902711
KSMD2         3.1544713E-03       ADMD1         0.3999490


TABLE XVII

Mean square error for c = 4 and n = 9

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          9.8934094E-04       BLUE          0.1908786
ADMD1         1.1494281E-03       KSMD2         0.2049502
CVMD1         1.3984025E-03       CVMD2         0.2244719
ADMD2         1.4030078E-03       ADMD2         0.2334024
CVMD2         1.4857060E-03       ADMD1         0.2394877
KSMD1         1.5590133E-03       CVMD1         0.2420758
KSMD2         1.7752400E-03       KSMD1         0.2460075

TABLE XVIII

Mean square error for c = 4 and n = 12

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          4.3847732E-04       BLUE          0.1464246
ADMD1         5.3575670E-04       KSMD2         0.1606177
ADMD2         6.4326968E-04       ADMD2         0.1745982
CVMD1         7.0931221E-04       CVMD2         0.1752579
CVMD2         8.0394966E-04       ADMD1         0.1773998
KSMD1         8.4389572E-04       CVMD1         0.1846511
KSMD2         9.7926683E-04       KSMD1         0.1893843


TABLE XIX

Mean square error for c = 4 and n = 15

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          2.8805208E-04       BLUE          0.1210762
ADMD1         3.7485354E-04       KSMD2         0.1318542
ADMD2         4.3274325E-04       CVMD2         0.1393966
CVMD1         5.3938437E-04       ADMD2         0.1417205
CVMD2         6.0927385E-04       ADMD1         0.1437690
KSMD1         6.2738883E-04       CVMD1         0.1477181
KSMD2         7.4160466E-04       KSMD1         0.1574258


TABLE XX

Mean square error for c = 4 and n = 18

        LOCATION (a)                      SCALE (b)

ESTIMATION                        ESTIMATION
TECHNIQUE     MSE                 TECHNIQUE     MSE

BLUE          1.9949843E-04       BLUE          9.3522320E-02
ADMD1         2.6885197E-04       KSMD2         1.0103040E-01
ADMD2         2.9941079E-04       ADMD1         1.0706935E-01
CVMD1         4.2770977E-04       ADMD2         1.0771950E-01
CVMD2         4.5176961E-04       CVMD2         1.0861484E-01
KSMD1         5.0622342E-04       CVMD1         1.1239370E-01
KSMD2         5.9582230E-04       KSMD1         1.1357192E-01

The following FORTRAN computer program, BLUMD.FOR, was written to
perform the Monte Carlo analysis and to generate the mean square errors
for each estimation technique investigated. Program documentation is
included within the program as comment statements to inform the reader
of the purpose of each statement or group of statements. Additionally,
each subroutine is prefaced by extensive documentation to inform the
reader of the purpose of the subroutine, all of the variables used in
the subroutine, the input variables required, the output variables
generated, and the major computations performed within the subroutine to
obtain the desired outputs.

BLUMD (BLU/MINIMUM DISTANCE) MAIN PROGRAM

c ------------------------------------------------------------------
c  Purpose:   BLUMD calculates the best linear unbiased estimates in
c             addition to the six minimum distance estimates
c             (based on the Kolmogorov, the Cramer-von Mises, and the
c             Anderson-Darling distances) for both the location and
c             scale parameters of the three parameter Pareto
c             distribution, where the shape parameter is varied
c             between the integer values 1, 2, 3 and 4.  Sample sizes
c             of 6, 9, 12, 15, and 18 are used.  Pareto variates are
c             generated for each combination of shape parameter and
c             sample size.  Finally, BLUMD calculates the mean square
c             error for each estimate type to compare which
c             estimation technique performs best.
c ------------------------------------------------------------------


c  Variables: n     = sample size
c             c     = shape parameter
c             nn    = sample size symbol (varies from 1-5
c                     representing each permissible sample size)
c             kkk   = dummy variable used to convert nn to n by
c                     using the formula: kkk = 3 + (nn*3)
c             dseed = double precision seed for the Pareto variates
c             x     = array of Pareto variates
c             B     = array of B values used to calculate the blues
c                     for shape greater than 2
c             BB    = array of BB values used to calculate the blues
c                     for shape less than or equal to 2
c             D     = constant used to calculate the blues for shape
c                     greater than 2
c             Anc   = constant used to calculate the blues for shape
c                     less than or equal to 2
c             Bnc   = constant used to calculate the blues for shape
c                     less than or equal to 2
c             ablu  = blu estimate of location, a
c             bblu  = blu estimate of scale, b
c             aKS   = Kolmogorov minimum distance estimate for a
c             bKS   = Kolmogorov minimum distance estimate for b
c             aCVM  = Cramer-von Mises min distance estimate for a
c             bCVM  = Cramer-von Mises min distance estimate for b
c             aAD   = Anderson-Darling min distance estimate for a
c                     while holding b = bblu as constant
c             bAD   = Anderson-Darling min distance estimate for b
c                     while holding a = ablu as constant
c             a2AD  = Anderson-Darling min distance estimate for a
c             b2AD  = Anderson-Darling min distance estimate for b
c             a1CV  = Cramer-von Mises min distance estimate for a
c                     while holding b = bblu as constant
c             b1CV  = Cramer-von Mises min distance estimate for b
c                     while holding a = ablu as constant
c             a1KS  = Kolmogorov minimum distance estimate for a
c                     while holding b = bblu as constant
c             b1KS  = Kolmogorov minimum distance estimate for b
c                     while holding a = ablu as constant
c             sse   = array of sum of squared errors for each
c                     estimation technique where the true value of
c                     a is 1 and the true value of b is 1.
c             mse   = array of mean square errors for each
c                     estimation technique used
c             count = array of counters used to count the number of
c                     valid estimate values found
c             anda  = array of calculated A-D distance measures
c                     when estimating location alone
c             esta  = array of location estimates used to minimize
c                     the A-D distance
c             icnt  = counter for the number of location estimates
c                     used to minimize the A-D distance
c             andb  = array of calculated A-D distance measures
c                     when estimating scale alone
c             estb  = array of scale estimates used to minimize
c                     the A-D distance
c             icntb = counter for the number of scale estimates
c                     used to minimize the A-D distance
c             andab = array of calculated A-D distance measures
c                     when estimating a and b simultaneously
c             estaa = array of location estimates used to minimize
c                     the A-D distance
c             estbb = array of scale estimates used to minimize
c                     the A-D distance
c             icntab = counter for the number of location and scale
c                     estimates used to minimize the A-D distance
c ------------------------------------------------------------------
c  Inputs:    dseed = double precision seed for Pareto variate
c                     generation
c             c     = shape parameter
c             n     = sample size
c ------------------------------------------------------------------
c  Outputs:   mse   = array of mean square errors for each
c                     estimation technique for each parameter under
c                     investigation (location and scale)
c ------------------------------------------------------------------
c  Calculate: mse = sse/number of trials
c ------------------------------------------------------------------

c  *** Variable Declarations

      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1       aCVM,bCVM,aAD,bAD,nn,count,a2AD,b2AD,anda,esta,icnt
     1       ,andb,estb,icntb,andab,estaa,estbb,icntab
     1       ,a1CV,b1CV,a1KS,b1KS
      integer n,nn,count(4,5,14),c,kkk,icnt,icntb,icntab
      real x(18),ablu,bblu,B(18),D,Anc,Bnc,BB(18),aKS,bKS,
     1     aCVM,bCVM,aAD,bAD,sse(4,5,14),mse(4,5,14),a2AD,b2AD
     1     ,anda(500),andb(500),andab(500),esta(500),estb(500),
     1     estaa(500),estbb(500),a1CV,b1CV,a1KS,b1KS
      double precision dseed
      call uerset(0,levold)
      dseed = 43846219217.d00
      print*,'dseed = ',dseed
      do 90 i=1,4
        c = i
        nn = 0
        do 80 j=6,18,3
          n = j
          nn = nn + 1
          do 40 jj=1,14
            sse(c,nn,jj) = 0
 40       continue
          do 70 it = 1,1000
            if ((it.eq.200) .or. (it.eq.400) .or. (it.eq.600) .or.
     1          (it.eq.800) .or. (it.eq.1000)) then
              print*,'c=',c,' n=',n,' iteration=',it
            end if
            call PARVAR
            if (c .gt. 2) then
              call BCGT2
              call BLCGT2
              go to 45
            end if
            call BCLE2
            call BLCLE2
 45         if (ablu .eq. 0 .and. bblu .eq. 0) then
              go to 70
            end if
            call KSMD
            call CVMMD
            call ADMD
            call ADBMD
            call AD2MD
            call CVAMD
            call CVBMD
            call KSAMD
            call KSBMD

c  *** Calculate the Sum of Squared Errors

 50         sse(c,nn,1)  = sse(c,nn,1)  + (ablu - 1)**2
            sse(c,nn,2)  = sse(c,nn,2)  + (bblu - 1)**2
            sse(c,nn,3)  = sse(c,nn,3)  + (aKS - 1)**2
            sse(c,nn,4)  = sse(c,nn,4)  + (bKS - 1)**2
            sse(c,nn,5)  = sse(c,nn,5)  + (aCVM - 1)**2
            sse(c,nn,6)  = sse(c,nn,6)  + (bCVM - 1)**2
            sse(c,nn,7)  = sse(c,nn,7)  + (aAD - 1)**2
            sse(c,nn,8)  = sse(c,nn,8)  + (bAD - 1)**2
            sse(c,nn,9)  = sse(c,nn,9)  + (a2AD - 1)**2
            sse(c,nn,10) = sse(c,nn,10) + (b2AD - 1)**2
            sse(c,nn,11) = sse(c,nn,11) + (a1CV - 1)**2
            sse(c,nn,12) = sse(c,nn,12) + (b1CV - 1)**2
            sse(c,nn,13) = sse(c,nn,13) + (a1KS - 1)**2
            sse(c,nn,14) = sse(c,nn,14) + (b1KS - 1)**2
            if (it .eq. 1000) then

c  *** Calculate the mean square error for each estimate type

              do 60 ll = 1,14
                kkk = 3 + (nn*3)
                if (count(c,nn,ll) .eq. 0) then
                  print*,'count=0 for c=',c,' n=',kkk,' est=',ll
                  go to 60
                end if
                mse(c,nn,ll) = sse(c,nn,ll)/count(c,nn,ll)
                print*,'mse=',mse(c,nn,ll),' c=',c,' n=',kkk,
     1                 ' est=',ll
                print*,'count = ',count(c,nn,ll)
 60           continue
            end if
 70       continue
 80     continue
 90   continue
      end

Subroutine PARVAR

c ------------------------------------------------------------------
c  Purpose:   For a specified sample size, n, PARVAR
c             generates n random variates from a Pareto
c             distribution with location and scale parameters
c             set equal to one and the shape parameter, c,
c             set to either 1, 2, 3 or 4.
c ------------------------------------------------------------------
c  Formula:   x = (1/r)**(1/c)
c ------------------------------------------------------------------
c  Variables: r     = random number
c             c     = shape parameter
c             x     = array of Pareto variates
c             n     = sample size
c             dseed = random number seed
c ------------------------------------------------------------------
c  Inputs:    dseed = random number seed
c             c     = shape parameter
c             n     = sample size
c ------------------------------------------------------------------
c  Outputs:   x     = array of Pareto random variates
c ------------------------------------------------------------------
c  Calculate: x(j) = (1/r(j)) ** (1/c)
c ------------------------------------------------------------------
c  *** Variable Declarations

      real r(18),x(18),ablu,bblu,B(18),D,Anc,Bnc,BB(18),aKS,bKS,
     1     aCVM,bCVM,aAD,bAD,a2AD,b2AD
      integer n,c,nn,count(4,5,14)
      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1       aCVM,bCVM,aAD,bAD,nn,count,a2AD,b2AD
      double precision dseed
      do 10 j=1,n

c  *** Call IMSL random number generator subroutine ggubs

        call ggubs(dseed,n,r)

c  *** Use the inverse transform technique for Pareto variates

        x(j) = (1/r(j))**(1/real(c))
 10   continue

c  *** Call VSRTA to sort the variates in ascending order

      call vsrta(x,n)
      return
      end

Subroutine BCLE2

c ------------------------------------------------------------------
c  Purpose:   For a given sample size, n, and a specified shape
c             (c=1 or c=2), BCLE2 calculates the B values used
c             to find the blu estimates of location and scale.
c             In addition, it calculates the constants Anc and Bnc
c             for the given shape and sample size.
c ------------------------------------------------------------------
c  Variables: c   = shape parameter
c             n   = sample size
c             BB  = array of B values (k in number)
c             k   = number of order statistics used; k=n-[2/c]
c             nc  = product of n and c
c             Anc = constant in the formula for the blu for scale
c             Bnc = constant in the formula for the blu for scale
c ------------------------------------------------------------------
c  Inputs:    c   = shape parameter
c             n   = sample size
c ------------------------------------------------------------------
c  Outputs:   BB  = array of B values
c             Anc = constant in the blue for the scale parameter
c             Bnc = constant in the blue for the scale parameter
c ------------------------------------------------------------------
c  Calculate:
c             Anc = (c+1)(c+2)(nc-1) / (nc-2)(nc-c-2)
c
c             Bnc = (nc-2) / (c+2)
c
c             For c=1 :  B(i) = (1 - i/n) [ 1 - i/(n-1) ]
c
c             For c=2 :  B(i) = 1 - i/n
c ------------------------------------------------------------------
c  *** Variable Declarations:

      real nc,x(18),ablu,bblu,B(18),D,Anc,Bnc,BB(18),aKS,bKS,
     1     aCVM,bCVM,aAD,bAD,a2AD,b2AD
      integer n,k,c,nn,count(4,5,14)
      double precision dseed
      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1       aCVM,bCVM,aAD,bAD,nn,count,a2AD,b2AD
      k=n-(2/c)

c  *** Calculate the B values when c=1

      if (c.eq.1) then
        do 10 j=1,k
          BB(j)=(1-j/real(n))*(1-j/(real(n)-1))
 10     continue

        go to 30
      end if

c  *** Calculate the B values when c=2 (i.e., c.ne.1)
     1       aCVM,bCVM,aAD,bAD,nn,count,a2AD,b2AD
      Bxsum=0
      k=n-(2/c)

c  *** Sum the products of B(i) and x(i) for i = 1,2,...,k

      do 10 j=1,k
        Bxsum=Bxsum+BB(j)*x(j)
 10   continue
      nc=n*c

c  *** Calculate the blue for scale, then for location

      bblu=Anc*(Bxsum - Bnc*x(1))
      ablu=x(1)-bblu/(nc-1)

c  *** Increment counter for valid blues

      if (bblu .gt. 0) then
        count(c,nn,1) = count(c,nn,1) + 1
        count(c,nn,2) = count(c,nn,2) + 1
      else
        print*,'bblu=',bblu,'ablu=',ablu,' negativity'
        ablu = 0
        bblu = 0
      end if
      return
      end

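      Subroutine BCGT2
c---------------------------------------------------------------------
c  Purpose:  For a given sample size, n, and a specified shape,
c            c>2, BCGT2 calculates the B and D values used to
c            find the blu estimates of location and scale.
c---------------------------------------------------------------------
c  Variables:  c    = shape parameter
c              n    = sample size
c              B    = array of B values (n in size)
c              D    = D value
c              bsum = sum of B values for i=1 ... (n-1)
c---------------------------------------------------------------------
c  Inputs:     c    = shape parameter
c              n    = sample size
c---------------------------------------------------------------------
c  Outputs:    B    = array of B values
c              D    = D value
c---------------------------------------------------------------------
c  Calculate:
c              B(i) = [1 - 2/(c(n-i+1))] * B(i-1)
c
c              D = (c+1)[B(1) + B(2) + ... + B(n-1)] + (c-1)B(n)
c
c---------------------------------------------------------------------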

c  *** Variable Declarations:

      real bsum,x(18),ablu,bblu,B(18),D,Anc,Bnc,BB(18),aKS,bKS,
     1     aCVM,bCVM,aAD,bAD,a2AD,b2AD
      integer n,c,nn,count(4,5,14)
      double precision dseed
      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1       aCVM,bCVM,aAD,bAD,nn,count,a2AD,b2AD

c  *** Calculate the first B value

      B(1)=(1-(2/(c*real(n))))

c  *** Calculate the second thru the nth B values

      do 10 j=2,n
         B(j)=B(j-1)*(1-(2/(real(c)*(n-j+1))))
   10 continue
      bsum=0

c  *** Sum the 'first' to the 'nth minus one' B values

      do 20 k=1,(n-1)
         bsum=bsum+B(k)
   20 continue

c  *** Calculate the D value

      D = (c+1) * bsum + (c-1) * B(n)
      return
      end
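
The recurrence documented in the BCGT2 header can be checked independently of the Fortran. The following is a minimal Python sketch, not part of the original program: the function name `blue_coefficients` is hypothetical, and it assumes an integer shape c > 2, as in the listing. Exact fractions are used only so that the check has no rounding error.

```python
from fractions import Fraction


def blue_coefficients(n, c):
    """Return (B, D) for sample size n and integer shape c > 2.

    Hypothetical helper that mirrors the BCGT2 comments:
      B(1) = 1 - 2/(c*n)
      B(i) = [1 - 2/(c*(n-i+1))] * B(i-1),  i = 2..n
      D    = (c+1)*[B(1)+...+B(n-1)] + (c-1)*B(n)
    """
    B = []
    prev = Fraction(1)  # plays the role of B(0) = 1
    for i in range(1, n + 1):
        prev *= 1 - Fraction(2, c * (n - i + 1))
        B.append(prev)
    D = (c + 1) * sum(B[:-1]) + (c - 1) * B[-1]
    return B, D


B, D = blue_coefficients(n=3, c=3)
print([str(b) for b in B], D)
```

For n = 3 and c = 3 the sketch gives B = 7/9, 14/27, 14/81 and D = 448/81, which can be confirmed by hand from the two formulas above.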

      Subroutine BLCGT2
c---------------------------------------------------------------------
c  Purpose:  For a given sample size, n, and a specified shape,
c            c>2, BLCGT2 calculates the best linear unbiased
c            estimates of location and scale.
c---------------------------------------------------------------------
c  Variables:  x     = array of ordered Pareto variates
c              c     = shape parameter
c              n     = sample size
c              B     = array of B values used to calculate the blues
c              D     = D value used to calculate the blues
c              Y     = Y value used to calculate the blues
c              ablu  = blu for location parameter, a
c              bblu  = blu for scale parameter, b
c              Bxsum = sum of [B(i) * x(i)] terms for i = 1,2,...,(n-1)
c              nc    = product of n and c
c              count = array of counters used to count the number of
c                      valid estimate values found
c---------------------------------------------------------------------
c  Inputs:     x     = array of ordered Pareto variates
c              c     = shape parameter
c              n     = sample size
c              B     = array of B values used to calculate blues
c              D     = D value used to calculate the blues
c---------------------------------------------------------------------
c  Outputs:    ablu  = blu estimate for location, a
c              bblu  = blu estimate for scale, b
c---------------------------------------------------------------------
c  Calculate:
c              Y = (c+1)[ B(1)x(1) + B(2)x(2) + ... + B(n-1)x(n-1) ]
c                  + (c-1)[ B(n)x(n) ] - Dx(1)
c
c              a = x(1) - Y/[(nc-1)(nc-2) - ncD]
c
c              b = (nc-1) [ x(1) - a ]
c
c---------------------------------------------------------------------

c  *** Variable Declarations:

      real x(18),ablu,bblu,Bxsum,nc,Y,B(18),D,Anc,Bnc,BB(18)
     1     ,aKS,bKS,aCVM,bCVM,aAD,bAD,a2AD,b2AD
      integer n,c,nn,count(4,5,14)
      double precision dseed
      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1       aCVM,bCVM,aAD,bAD,nn,count,a2AD,b2AD
      Bxsum=0

c  *** Sum the products of the B(i) and x(i) values to i = n-1

      do 10 j=1,(n-1)
         Bxsum=Bxsum+B(j)*x(j)
   10 continue

c  *** Calculate the Y value

      Y=(c+1)*Bxsum+(c-1)*B(n)*x(n)-D*x(1)
      nc=n*c

c  *** Calculate the blu estimates for location and scale

      ablu=x(1)-Y/((nc-1)*(nc-2)-(nc*D))
      bblu=(nc-1)*(x(1)-ablu)

c  *** Increment counters for valid blues

      if (bblu .gt. 0) then
         count(c,nn,1) = count(c,nn,1) + 1
         count(c,nn,2) = count(c,nn,2) + 1
      else
         print*,'ablu=',ablu,' bblu=',bblu,' negativity'
         ablu = 0
         bblu = 0
      end if
      return
      end
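
The BLCGT2 calculation can be illustrated with a short Python sketch. It is an assumption-laden restatement of the header formulas, not the thesis code: the name `blue_estimates` is hypothetical, the B and D values are recomputed inline rather than read from a common block, and the counter bookkeeping and the reset to zero when bblu is not positive are omitted.

```python
def blue_estimates(x, c):
    """Hypothetical sketch of the BLCGT2 formulas for shape c > 2.

    x must be sorted ascending (ordered Pareto variates).
    Returns (a, b): the BLU estimates of location and scale.
    """
    n = len(x)

    # B and D values, following the BCGT2 header comments.
    B, prev = [], 1.0
    for i in range(1, n + 1):
        prev *= 1.0 - 2.0 / (c * (n - i + 1))
        B.append(prev)
    D = (c + 1) * sum(B[:-1]) + (c - 1) * B[-1]

    # Y = (c+1)*sum_{i<n} B(i)x(i) + (c-1)*B(n)x(n) - D*x(1)
    bx_sum = sum(b_i * x_i for b_i, x_i in zip(B[:-1], x[:-1]))
    Y = (c + 1) * bx_sum + (c - 1) * B[-1] * x[-1] - D * x[0]

    nc = n * c
    a = x[0] - Y / ((nc - 1) * (nc - 2) - nc * D)
    b = (nc - 1) * (x[0] - a)
    return a, b


a, b = blue_estimates([1.0, 1.5, 3.0], c=3)
print(a, b)
```

For the sample (1.0, 1.5, 3.0) with c = 3, working the formulas by hand gives a = 13/18 and b = 20/9, which the sketch reproduces to floating-point accuracy.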

      Subroutine KSMD
c---------------------------------------------------------------------
c  Purpose:  KSMD generates the minimum distance estimates of
c            location and scale based upon minimizing the
c            Kolmogorov distance measure defined in subroutine
c            KDIS.  This routine uses the blu estimates as the
c            starting points for the estimate modifications.
c---------------------------------------------------------------------
c  Variables:  NPAR    = number of parameters altered by minimizing
c                        the Kolmogorov distance
c              NSIG    = number of significant digits for convergence
c              MAXFN   = maximum number of function evaluations
c              IOPT    = options selector (see IMSL manual on ZXMIN)
c              H, G, W = vectors defined in IMSL manual on ZXMIN
c              IER     = error parameter (see IMSL manual on ZXMIN)
c              F       = value of Kolmogorov distance at the final
c                        parameter estimates
c              kse     = Kolmogorov derived minimum distance estimates
c              aKS     = Kolmogorov minimum distance location estimate
c              bKS     = Kolmogorov minimum distance scale estimate
c              ablu    = blu estimate of location
c              bblu    = blu estimate of scale
c              count   = array of counters used to count the number of
c                        valid estimate values found
c---------------------------------------------------------------------
c  Inputs:     NPAR    = number of parameters altered while minimizing
c              NSIG    = number of significant digits required
c              MAXFN   = maximum number of function evaluations
c              IOPT    = options selector (see IMSL manual on ZXMIN)
c              kse     = initial estimates for the minimization process
c              ablu    = blu estimate of location
c              bblu    = blu estimate of scale
c---------------------------------------------------------------------
c  Outputs:    F       = minimum value of the function being minimized
c              kse     = revised estimate values
c              aKS     = revised MD estimate of location [aKS = kse(1)]
c              bKS     = revised MD estimate of scale [bKS = kse(2)]
c              H, G, W = vectors defined in IMSL manual on ZXMIN
c              IER     = error parameter (see IMSL manual on ZXMIN)
c---------------------------------------------------------------------
c  Calculate:  no calculations performed in this subroutine
c---------------------------------------------------------------------

c  *** Variables Declaration:

      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1       aCVM,bCVM,aAD,bAD,nn,count,a2AD,b2AD
      external kdis
      integer NPAR,NSIG,MAXFN,IOPT,n,c,nn,count(4,5,14)
      real kse(2),H(3),G(2),W(6),F,x(18),ablu,bblu,aKS,bKS,
     1     B(18),D,Anc,Bnc,BB(18),aCVM,bCVM,aAD,bAD,a2AD,b2AD
      double precision dseed

c  *** Enter the ZXMIN required constants

      NPAR = 2
      NSIG = 3
      MAXFN = 500
      IOPT = 0

c  *** Initialize the kse values to the blu estimates

      kse(1) = ablu
      kse(2) = bblu

c  *** Call ZXMIN to refine the kse values by minimizing
c  *** the Kolmogorov distance (KST) computed in the
c  *** subroutine KDIS

      call ZXMIN(KDIS,NPAR,NSIG,MAXFN,IOPT,kse,H,G,F,W,IER)

c  *** Relabel the refined estimates of location and scale

      aKS = kse(1)
      bKS = kse(2)

c  *** Increment the KS counters

      count(c,nn,3) = count(c,nn,3) + 1
      count(c,nn,4) = count(c,nn,4) + 1
      return
      end

      Subroutine KDIS(NPAR,kse,F)
c---------------------------------------------------------------------
c  Purpose:  KDIS provides the function which is to be minimized
c            by ZXMIN for the Kolmogorov distance measure.  The
c            location and scale parameters are altered to achieve
c            this minimization.
c---------------------------------------------------------------------
c  Variables:  NPAR   = number of parameters available to alter
c              n      = sample size
c              kse    = estimates of the parameters being altered
c              F      = value of the function to be minimized
c              x      = array of ordered Pareto variates
c              c      = shape parameter
c              zi     = array of Pareto cdf points
c              DP     = positive differences between the EDF and cdf
c              DM     = negative differences between the EDF and cdf
c              DPLUS  = maximum positive difference
c              DMINUS = maximum negative difference
c              KST    = maximum of DPLUS and DMINUS
c---------------------------------------------------------------------
c  Inputs:     NPAR   = number of parameters available to alter
c              n      = sample size
c              kse    = initial estimates (the blu estimates)
c              x      = array of ordered Pareto variates
c              c      = shape parameter
c---------------------------------------------------------------------
c  Outputs:    F      = value of the function at the final estimates
c              kse    = revised estimates of location and scale;
c                       these are the Kolmogorov minimum distance
c                       estimates
c---------------------------------------------------------------------
c  Calculations:
c              z(i)  = 1 - (1 + [x(i)-a]/b)**(-c)
c
c              DP(i) = ABS[ i/n - z(i) ]
c
c              DM(i) = ABS[ z(i) - (i-1)/n ]
c---------------------------------------------------------------------

c  *** Variable Declarations:

      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1       aCVM,bCVM,aAD,bAD,nn,count,a2AD,b2AD
      integer NPAR,n,c,nn,count(4,5,14)
      real kse(NPAR),F,x(18),zi(18),DP(18),DM(18),DPLUS,
     1     DMINUS,KST,ablu,bblu,B(18),D,Anc,Bnc,BB(18),aKS,bKS,
     1     aCVM,bCVM,aAD,bAD,a2AD,b2AD
      double precision dseed

c  *** Calculate the Pareto cdf value (zi(j)) at each point
c  *** and the differences between the EDF step function
c  *** and the cdf points

      do 10 j = 1,n
         zi(j) = 1-(1/(1+(x(j)-kse(1))/kse(2)))**c
         DP(j) = ABS(j/real(n) - zi(j))
         DM(j) = ABS(zi(j) - (j-1)/real(n))
   10 continue

c  *** Select the maximum of the plus and minus differences

      DPLUS = MAX(DP(1),DP(2),DP(3),DP(4),DP(5),DP(6),DP(7)
     1     ,DP(8),DP(9),DP(10),DP(11),DP(12),DP(13),DP(14)
     1     ,DP(15),DP(16),DP(17),DP(18))
      DMINUS = MAX(DM(1),DM(2),DM(3),DM(4),DM(5),DM(6),DM(7)
     1     ,DM(8),DM(9),DM(10),DM(11),DM(12),DM(13),DM(14)
     1     ,DM(15),DM(16),DM(17),DM(18))

c  *** Select the maximum Kolmogorov distance measure and
c  *** set F equal to that distance.  F becomes the
c  *** function which ZXMIN attempts to minimize by
c  *** altering the values of the location and scale
c  *** parameters

      KST = MAX(DPLUS,DMINUS)
      F = KST
      return
      end
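
The objective that KDIS hands to ZXMIN can be restated compactly. The Python sketch below is illustrative only: the names `pareto_cdf` and `kolmogorov_distance` are hypothetical, and, unlike the listing (which passes all 18 array elements to MAX), it takes the maximum over the n sample points only.

```python
def pareto_cdf(x, a, b, c):
    """Pareto cdf used in the listing: 1 - (1 + (x-a)/b)**(-c)."""
    return 1.0 - (1.0 + (x - a) / b) ** (-c)


def kolmogorov_distance(x, a, b, c):
    """Hypothetical sketch of the KDIS objective for sorted data x.

    DP(i) = |i/n - z(i)|, DM(i) = |z(i) - (i-1)/n|,
    and the distance is the larger of max DP and max DM.
    """
    n = len(x)
    z = [pareto_cdf(xi, a, b, c) for xi in x]
    d_plus = max(abs(i / n - z[i - 1]) for i in range(1, n + 1))
    d_minus = max(abs(z[i - 1] - (i - 1) / n) for i in range(1, n + 1))
    return max(d_plus, d_minus)


print(kolmogorov_distance([1.0, 3.0], a=0.0, b=1.0, c=1))
```

With a = 0, b = 1, c = 1 and the sample (1, 3), the cdf points are 0.5 and 0.75, so the largest gap is DM(1) = 0.5.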

      Subroutine CVMMD
c---------------------------------------------------------------------
c  Purpose:  CVMMD generates the minimum distance estimates of
c            location and scale based upon minimizing the
c            Cramer-von Mises distance measure defined in subroutine
c            CVMDIS.  This routine uses the blu estimates as the
c            starting points for the estimate modifications.
c---------------------------------------------------------------------
c  Variables:  NPAR    = number of parameters altered by minimizing
c                        the Cramer-von Mises (CVM) distance
c              NSIG    = number of significant digits for convergence
c              MAXFN   = maximum number of function evaluations
c              IOPT    = options selector (see IMSL manual on ZXMIN)
c              H, G, W = vectors defined in IMSL manual on ZXMIN
c              IER     = error parameter (see IMSL manual on ZXMIN)
c              F       = value of CVM distance at the final
c                        parameter estimates
c              cvme    = CVM derived minimum distance estimates
c              aCVM    = CVM minimum distance location estimate
c              bCVM    = CVM minimum distance scale estimate
c              ablu    = blu estimate of location
c              bblu    = blu estimate of scale
c              count   = array of counters used to count the number of
c                        valid estimate values found
c---------------------------------------------------------------------
c  Inputs:     NPAR    = number of parameters altered while minimizing
c              NSIG    = number of significant digits required
c              MAXFN   = maximum number of function evaluations
c              IOPT    = options selector (see IMSL manual on ZXMIN)
c              cvme    = initial estimates for the minimization process
c              ablu    = blu estimate of location
c              bblu    = blu estimate of scale
c---------------------------------------------------------------------
c  Outputs:    F       = minimum value of the function being minimized
c              cvme    = revised estimate values
c              aCVM    = revised MD estimate of location [aCVM=cvme(1)]
c              bCVM    = revised MD estimate of scale [bCVM=cvme(2)]
c              H, G, W = vectors defined in IMSL manual on ZXMIN
c              IER     = error parameter (see IMSL manual on ZXMIN)
c---------------------------------------------------------------------
c  Calculate:  no calculations performed in this subroutine
c---------------------------------------------------------------------

c  *** Variables Declaration:

      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1       aCVM,bCVM,aAD,bAD,nn,count,a2AD,b2AD
      external cvmdis
      integer NPAR,NSIG,MAXFN,IOPT,n,c,nn,count(4,5,14)
      real cvme(2),H(3),G(2),W(6),F,x(18),ablu,bblu,aCVM,bCVM
     1     ,B(18),D,Anc,Bnc,BB(18),aKS,bKS,aAD,bAD,a2AD,b2AD
      double precision dseed

c  *** Enter the ZXMIN required constants

      NPAR = 2
      NSIG = 3
      MAXFN = 500
      IOPT = 0

c  *** Initialize the cvme values to the blu estimates

      cvme(1) = ablu
      cvme(2) = bblu

c  *** Call ZXMIN to refine the cvme values by minimizing
c  *** the CVM distance (W2) computed in the
c  *** subroutine CVMDIS

      call ZXMIN(CVMDIS,NPAR,NSIG,MAXFN,IOPT,cvme,H,G,F,W,IER)

c  *** Relabel the refined estimates of location and scale

      aCVM = cvme(1)
      bCVM = cvme(2)

c  *** Increment the CVM counters

      count(c,nn,5) = count(c,nn,5) + 1
      count(c,nn,6) = count(c,nn,6) + 1
      return
      end

      Subroutine CVMDIS(NPAR,cvme,F)
c---------------------------------------------------------------------
c  Purpose:  CVMDIS provides the function which is to be minimized
c            by ZXMIN for the Cramer-von Mises distance measure.
c            The location and scale parameters are altered to
c            achieve this minimization.
c---------------------------------------------------------------------
c  Variables:  NPAR = number of parameters available to alter
c              n    = sample size
c              cvme = estimates of the parameters being altered
c              F    = value of the function to be minimized
c              x    = array of ordered Pareto variates
c              c    = shape parameter
c              zi   = array of Pareto cdf points
c              ACV  = the squared quantity in the W2 formula
c              SCVM = the sum of the ACV quantities
c              W2   = the CVM distance measure
c---------------------------------------------------------------------
c  Inputs:     NPAR = number of parameters available to alter
c              n    = sample size
c              cvme = initial estimates (the blu estimates)
c              x    = array of ordered Pareto variates
c              c    = shape parameter
c---------------------------------------------------------------------
c  Outputs:    F    = value of the function at the final estimates
c              cvme = revised estimates of location and scale;
c                     these are the CVM minimum distance estimates
c---------------------------------------------------------------------
c  Calculations:
c              z(i)   = 1 - (1 + [x(i)-a]/b)**(-c)
c              ACV(i) = [ z(i) - (2i-1)/2n ]**2
c              SCVM   = ACV(1) + ACV(2) + ... + ACV(n)
c              W2     = SCVM + 1/12n
c---------------------------------------------------------------------

c  *** Variable Declarations:

      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1       aCVM,bCVM,aAD,bAD,nn,count,a2AD,b2AD
      integer NPAR,n,c,nn,count(4,5,14)
      real cvme(NPAR),F,x(18),zi(18),SCVM,ACV(18),W2
     1     ,ablu,bblu,B(18),D,Anc,Bnc,BB(18),aKS,bKS,aCVM,bCVM,
     1     aAD,bAD,a2AD,b2AD
      double precision dseed
      SCVM = 0
      do 10 j = 1,n
         zi(j) = 1-(1/(1+(x(j)-cvme(1))/cvme(2)))**c
         ACV(j) = (zi(j) - (2*j-1)/(2*real(n)))**2
         SCVM = SCVM + ACV(j)
   10 continue
      W2 = SCVM + (1/(12*real(n)))
      F = W2
      return
      end
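
The Cramer-von Mises objective is short enough to restate directly. The Python sketch below follows the formulas in the CVMDIS header; the function name `cvm_distance` is hypothetical and the code is an illustration, not the thesis implementation.

```python
def cvm_distance(x, a, b, c):
    """Hypothetical sketch of the CVMDIS objective for sorted data x.

    W2 = sum_i [ z(i) - (2i-1)/(2n) ]**2 + 1/(12n),
    with z(i) = 1 - (1 + (x(i)-a)/b)**(-c).
    """
    n = len(x)
    total = 0.0
    for i, xi in enumerate(x, start=1):
        z = 1.0 - (1.0 + (xi - a) / b) ** (-c)
        total += (z - (2 * i - 1) / (2 * n)) ** 2
    return total + 1.0 / (12 * n)


print(cvm_distance([1.0, 3.0], a=0.0, b=1.0, c=1))
```

For the sample (1, 3) with a = 0, b = 1, c = 1, the cdf points are 0.5 and 0.75, so the squared terms are 0.0625 and 0, and W2 = 0.0625 + 1/24.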

      Subroutine ADMD
c---------------------------------------------------------------------
c  Purpose:  ADMD generates the minimum distance estimates of
c            the location parameter based upon minimizing the
c            Anderson-Darling distance measure defined in
c            subroutine ADDIS.  ADMD uses the blu estimates as the
c            starting points for the estimate modifications.
c---------------------------------------------------------------------
c  Variables:  NPAR    = number of parameters altered by minimizing
c                        the Anderson-Darling (A-D) distance
c              NSIG    = number of significant digits for convergence
c              MAXFN   = maximum number of function evaluations
c              IOPT    = options selector (see IMSL manual on ZXMIN)
c              H, G, W = vectors defined in IMSL manual on ZXMIN
c              IER     = error parameter (see IMSL manual on ZXMIN)
c              F       = value of A-D distance at the final
c                        parameter estimates
c              ade     = A-D derived minimum distance estimate
c              aAD     = A-D minimum distance location estimate
c              ablu    = blu estimate of location
c              bblu    = blu estimate of scale
c              count   = array of counters used to count the number of
c                        valid estimate values found
c              anda    = array of calculated A-D distance measures
c              esta    = array of location estimates used to minimize
c                        the A-D distance measure
c              icnt    = counter for the number of estimates used
c---------------------------------------------------------------------
c  Inputs:     NPAR    = number of parameters altered while minimizing
c              NSIG    = number of significant digits required
c              MAXFN   = maximum number of function evaluations
c              IOPT    = options selector (see IMSL manual on ZXMIN)
c              ade     = initial estimate for the minimization process
c              ablu    = blu estimate of location
c              bblu    = blu estimate of scale
c---------------------------------------------------------------------
c  Outputs:    F       = minimum value of the function being minimized
c              ade     = revised estimate values
c              aAD     = revised MD estimate of location [aAD = ade(1)]
c              H, G, W = vectors defined in IMSL manual on ZXMIN
c              IER     = error parameter (see IMSL manual on ZXMIN)
c---------------------------------------------------------------------
c  Calculate:  no calculations performed in this subroutine
c---------------------------------------------------------------------

c  *** Variables Declaration:

      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1       aCVM,bCVM,aAD,bAD,nn,count,a2AD,b2AD,anda,esta,icnt
     1       ,andb,estb,icntb,andab,estaa,estbb,icntab
      external addis
      integer NPAR,NSIG,MAXFN,IOPT,n,c,nn,count(4,5,14),icnt,hh
     1     ,icntb,icntab
      real ade(1),H(1),G(1),W(3),F,x(18),ablu,bblu,aAD,bAD
     1     ,B(18),D,Anc,Bnc,BB(18),aKS,bKS,aCVM,bCVM,a2AD,b2AD
     1     ,anda(500),esta(500),AD1,andb(500),estb(500),andab(500)
     1     ,estaa(500),estbb(500)
      double precision dseed

c  *** Enter the ZXMIN required constants

      NPAR = 1
      NSIG = 3
      MAXFN = 500
      IOPT = 0

c  *** Initialize the ade value to the blu estimate

      ade(1) = ablu

c  *** Call ZXMIN to refine the ade values by minimizing
c  *** the Anderson-Darling distance (AD) computed in
c  *** the subroutine ADDIS

      call ZXMIN(ADDIS,NPAR,NSIG,MAXFN,IOPT,ade,H,G,F,W,IER)
      aAD = ade(1)

c  *** Reinitialize the icnt, anda, and esta arrays

   25 icnt = 0
      do 30 i = 1,W(2)
         anda(i) = 0
         esta(i) = 0
   30 continue

c  *** Increment AD counter for valid AD estimates

      count(c,nn,7) = count(c,nn,7) + 1
      return
      end

      Subroutine ADDIS(NPAR,ade,F)
c---------------------------------------------------------------------
c  Purpose:  ADDIS provides the function which is to be minimized
c            by ZXMIN for the Anderson-Darling distance measure.
c            The location parameter is altered to achieve the
c            minimization.
c---------------------------------------------------------------------
c  Variables:  NPAR = number of parameters available to alter
c              n    = sample size
c              ade  = estimates of the parameter being altered
c              F    = value of the function to be minimized
c              x    = array of ordered Pareto variates
c              c    = shape parameter
c              zi   = array of Pareto cdf points
c              AAA  = array of terms to be summed in AD formula
c              SAAD = sum of the AAA(i) terms
c              AD   = Anderson-Darling distance measure
c              anda = array of calculated A-D distance measures
c              esta = array of location estimates used to minimize
c                     the A-D distance measure
c              icnt = counter for the number of estimates used
c---------------------------------------------------------------------
c  Inputs:     NPAR = number of parameters available to alter
c              n    = sample size
c              ade  = initial estimate (the blu estimate)
c              x    = array of ordered Pareto variates
c              c    = shape parameter
c---------------------------------------------------------------------
c  Outputs:    F    = value of the function at the final estimate
c              ade  = revised estimate of location; this is the
c                     Anderson-Darling minimum distance estimate
c---------------------------------------------------------------------
c  Calculations:
c              z(i)   = 1 - (1 + [x(i)-a]/b)**(-c)
c              AAA(i) = (2i-1) [ ln z(i) + ln ( 1-z(n+1-i) ) ]
c              SAAD   = AAA(1) + AAA(2) + ... + AAA(n)
c              AD     = (-SAAD)/n - n
c---------------------------------------------------------------------
c  *** Variable Declarations:

      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1       aCVM,bCVM,aAD,bAD,nn,count,a2AD,b2AD,anda,esta,icnt
     1       ,andb,estb,icntb,andab,estaa,estbb,icntab
      integer NPAR,n,c,nn,count(4,5,14),icnt,icntb,icntab
      real ade(NPAR),F,x(18),zi(18),AAA(18),SAAD,AD
     1     ,ablu,bblu,B(18),D,Anc,Bnc,BB(18),aKS,bKS,aCVM,bCVM,
     1     aAD,bAD,a2AD,b2AD,anda(500),esta(500)
     1     ,andb(500),estb(500),andab(500),estaa(500),estbb(500)
      double precision dseed
      do 10 j=1,n

c  *** Calculate the Pareto cdf point values

         zi(j) = 1-(1/(1+(x(j)-ade(1))/bblu))**c

c  *** Test zi(j) and [ 1 - zi(j) ] for negativity

         if (zi(j).le.0 .or. zi(j).ge.1) then
            go to 30
         end if
   10 continue
      SAAD = 0

c  *** Calculate the Anderson-Darling distance

      do 20 m=1,n
         AAA(m) = (2*m-1) * (log(zi(m)) + log(1-zi(n+1-m)))
         SAAD = SAAD + AAA(m)
   20 continue
      AD = (-1) * (n + SAAD/n)

c  *** Save the AD and ade(1) values

      icnt = icnt + 1
      anda(icnt) = AD
      esta(icnt) = ade(1)

c  *** Relabel the A-D distance

      F = AD
      go to 40
   30 ade(1) = esta(icnt-1)
   40 return
      end
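
The Anderson-Darling objective shared by ADDIS and its sibling routines can be sketched as follows. This is a hedged Python illustration, not the thesis code: the name `anderson_darling_distance` is hypothetical, both a and b are passed explicitly (ADDIS varies only the location and takes the scale from bblu), and an out-of-range cdf point raises an error here, whereas the listing falls back to the previously saved estimate.

```python
import math


def anderson_darling_distance(x, a, b, c):
    """Hypothetical sketch of the A-D objective for sorted data x.

    AD = -(1/n) * sum_i (2i-1) * [ln z(i) + ln(1 - z(n+1-i))] - n,
    with z(i) = 1 - (1 + (x(i)-a)/b)**(-c).
    """
    n = len(x)
    z = [1.0 - (1.0 + (xi - a) / b) ** (-c) for xi in x]
    # The logarithms require 0 < z(i) < 1 for every point.
    if any(zi <= 0.0 or zi >= 1.0 for zi in z):
        raise ValueError("cdf point outside (0, 1); distance undefined")
    s = sum(
        (2 * i - 1) * (math.log(z[i - 1]) + math.log(1.0 - z[n - i]))
        for i in range(1, n + 1)
    )
    return -s / n - n


print(anderson_darling_distance([1.0, 3.0], a=0.0, b=1.0, c=1))
```

For the sample (1, 3) with a = 0, b = 1, c = 1 the cdf points are 0.5 and 0.75, and the distance evaluates to about 0.511.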

      Subroutine ADBMD
c---------------------------------------------------------------------
c  Purpose:  ADBMD generates the minimum distance estimates of
c            the scale parameter based upon minimizing the
c            Anderson-Darling distance measure defined in
c            subroutine ADBDIS.  ADBMD uses the blu estimates as
c            starting points for the estimate modifications.
c---------------------------------------------------------------------
c  Variables:  NPAR    = number of parameters altered by minimizing
c                        the Anderson-Darling (A-D) distance
c              NSIG    = number of significant digits for convergence
c              MAXFN   = maximum number of function evaluations
c              IOPT    = options selector (see IMSL manual on ZXMIN)
c              H, G, W = vectors defined in IMSL manual on ZXMIN
c              IER     = error parameter (see IMSL manual on ZXMIN)
c              F       = value of A-D distance at the final
c                        parameter estimates
c              ade     = A-D derived minimum distance estimate
c              bAD     = A-D minimum distance scale estimate
c              ablu    = blu estimate of location
c              bblu    = blu estimate of scale
c              count   = array of counters used to count the number of
c                        valid estimate values found
c              andb    = array of calculated A-D distance measures
c              estb    = array of scale estimates used to minimize
c                        the A-D distance measure
c              icntb   = counter for the number of estimates used
c---------------------------------------------------------------------
c  Inputs:     NPAR    = number of parameters altered while minimizing
c              NSIG    = number of significant digits required
c              MAXFN   = maximum number of function evaluations
c              IOPT    = options selector (see IMSL manual on ZXMIN)
c              ade     = initial estimate for the minimization process
c              ablu    = blu estimate of location
c              bblu    = blu estimate of scale
c---------------------------------------------------------------------
c  Outputs:    F       = minimum value of the function being minimized
c              ade     = revised estimate values
c              bAD     = revised MD estimate of scale [bAD = ade(1)]
c              H, G, W = vectors defined in IMSL manual on ZXMIN
c              IER     = error parameter (see IMSL manual on ZXMIN)
c---------------------------------------------------------------------
c  Calculate:  no calculations performed in this subroutine
c---------------------------------------------------------------------

c  *** Variables Declaration:

      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1       aCVM,bCVM,aAD,bAD,nn,count,a2AD,b2AD,anda,esta,icnt
     1       ,andb,estb,icntb,andab,estaa,estbb,icntab
      external adbdis
      integer NPAR,NSIG,MAXFN,IOPT,n,c,nn,count(4,5,14),icnt
     1     ,icntb,icntab,hhh
      real ade(1),H(1),G(1),W(3),F,x(18),ablu,bblu,aAD,bAD
     1     ,B(18),D,Anc,Bnc,BB(18),aKS,bKS,aCVM,bCVM,a2AD,b2AD
     1     ,anda(500),esta(500),andb(500),estb(500),andab(500),
     1     estaa(500),estbb(500),AD2
      double precision dseed

c  *** Enter the ZXMIN required constants

      NPAR = 1
      NSIG = 3
      MAXFN = 500
      IOPT = 0

c  *** Initialize the ade value to the blu estimate

      ade(1) = bblu

c  *** Call ZXMIN to refine the ade values by minimizing
c  *** the Anderson-Darling distance (AD) computed in
c  *** the subroutine ADBDIS

      call ZXMIN(ADBDIS,NPAR,NSIG,MAXFN,IOPT,ade,H,G,F,W,IER)
      bAD = ade(1)

c  *** Reinitialize the icntb, andb, and estb arrays

   25 icntb = 0
      do 30 i = 1,W(2)
         andb(i) = 0
         estb(i) = 0
   30 continue

c  *** Increment AD counter for valid AD estimates

      count(c,nn,8) = count(c,nn,8) + 1
      return
      end

      Subroutine ADBDIS(NPAR,ade,F)
c---------------------------------------------------------------------
c  Purpose:  ADBDIS provides the function which is to be minimized
c            by ZXMIN for the Anderson-Darling distance measure.
c            The scale parameter is altered to achieve the
c            minimization.
c---------------------------------------------------------------------
c  Variables:  NPAR  = number of parameters available to alter
c              n     = sample size
c              ade   = estimates of the parameter being altered
c              F     = value of the function to be minimized
c              x     = array of ordered Pareto variates
c              c     = shape parameter
c              zi    = array of Pareto cdf points
c              AAA   = array of terms to be summed in AD formula
c              SAAD  = sum of the AAA(i) terms
c              AD    = Anderson-Darling distance measure
c              andb  = array of calculated A-D distance measures
c              estb  = array of scale estimates used to minimize
c                      the A-D distance measure
c              icntb = counter for the number of estimates used
c---------------------------------------------------------------------
c  Inputs:     NPAR  = number of parameters available to alter
c              n     = sample size
c              ade   = initial estimate (the blu estimate)
c              x     = array of ordered Pareto variates
c              c     = shape parameter
c---------------------------------------------------------------------
c  Outputs:    F     = value of the function at the final estimate
c              ade   = revised estimate of scale; this is the
c                      Anderson-Darling minimum distance estimate
c---------------------------------------------------------------------
c  Calculations:
c              z(i)   = 1 - (1 + [x(i)-a]/b)**(-c)
c              AAA(i) = (2i-1) [ ln z(i) + ln ( 1-z(n+1-i) ) ]
c              SAAD   = AAA(1) + AAA(2) + ... + AAA(n)
c              AD     = (-SAAD)/n - n
c---------------------------------------------------------------------
c  *** Variable Declarations:

      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1       aCVM,bCVM,aAD,bAD,nn,count,a2AD,b2AD,anda,esta,icnt
     1       ,andb,estb,icntb,andab,estaa,estbb,icntab
      integer NPAR,n,c,nn,count(4,5,14),icnt,icntb,icntab
      real ade(NPAR),F,x(18),zi(18),AAA(18),SAAD,AD
     1     ,ablu,bblu,B(18),D,Anc,Bnc,BB(18),aKS,bKS,aCVM,bCVM,
     1     aAD,bAD,a2AD,b2AD
     1     ,anda(500),esta(500),andb(500),estb(500),andab(500),
     1     estaa(500),estbb(500)
      double precision dseed
      do 10 j=1,n

c  *** Calculate the Pareto cdf point values

         zi(j) = 1-(1/(1+(x(j)-ablu)/ade(1)))**c

c  *** Test zi(j) and [ 1 - zi(j) ] for negativity

         if (zi(j).le.0 .or. zi(j).ge.1) then
            go to 30
         end if
   10 continue
      SAAD = 0

c  *** Calculate the Anderson-Darling distance

      do 20 m=1,n
         AAA(m) = (2*m-1) * (log(zi(m)) + log(1-zi(n+1-m)))
         SAAD = SAAD + AAA(m)
   20 continue
      AD = (-1) * (n + SAAD/n)

c  *** Save the AD and ade(1) values

      icntb = icntb + 1
      andb(icntb) = AD
      estb(icntb) = ade(1)

c  *** Relabel the A-D distance

      F = AD
      go to 40
   30 ade(1) = estb(icntb-1)
   40 return
      end

      Subroutine AD2MD
c---------------------------------------------------------------------
c  Purpose:  AD2MD generates the minimum distance estimates of
c            location and scale simultaneously, based on minimizing
c            the Anderson-Darling distance measure defined in
c            subroutine AD2DIS.  AD2MD uses the blu estimates as the
c            starting points for the estimate modifications.
c---------------------------------------------------------------------
c  Variables:  NPAR    = number of parameters altered by minimizing
c                        the Anderson-Darling (A-D) distance
c              NSIG    = number of significant digits for convergence
c              MAXFN   = maximum number of function evaluations
c              IOPT    = options selector (see IMSL manual on ZXMIN)
c              H, G, W = vectors defined in IMSL manual on ZXMIN
c              IER     = error parameter (see IMSL manual on ZXMIN)
c              F       = value of A-D distance at the final
c                        parameter estimates
c              ade     = A-D derived minimum distance estimate
c              a2AD    = A-D minimum distance location estimate
c              b2AD    = A-D minimum distance scale estimate
c              ablu    = blu estimate of location
c              bblu    = blu estimate of scale
c              count   = array of counters used to count the number of
c                        valid estimate values found
c              andab   = array of calculated A-D distance measures
c              estaa   = array of location estimates used to minimize
c                        the A-D distance measure
c              estbb   = array of scale estimates used to minimize
c                        the A-D distance measure
c              icntab  = counter for the number of estimates used
c---------------------------------------------------------------------
c  Inputs:     NPAR    = number of parameters altered while minimizing
c              NSIG    = number of significant digits required
c              MAXFN   = maximum number of function evaluations
c              IOPT    = options selector (see IMSL manual on ZXMIN)
c              ade     = initial estimate for the minimization process
c              ablu    = blu estimate of location
c              bblu    = blu estimate of scale
c---------------------------------------------------------------------
c  Outputs:    F       = minimum value of the function being minimized
c              ade     = revised estimate values
c              a2AD    = revised MD estimate of location [a2AD = ade(1)]
c              b2AD    = revised MD estimate of scale [b2AD = ade(2)]
c              H, G, W = vectors defined in IMSL manual on ZXMIN
c              IER     = error parameter (see IMSL manual on ZXMIN)
c---------------------------------------------------------------------
c  Calculate:  no calculations performed in this subroutine
c---------------------------------------------------------------------

c  *** Variables Declaration:

      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1       aCVM,bCVM,aAD,bAD,nn,count,a2AD,b2AD,anda,esta,icnt
     1       ,andb,estb,icntb,andab,estaa,estbb,icntab
      external ad2dis
      integer NPAR,NSIG,MAXFN,IOPT,n,c,nn,count(4,5,14)
     1     ,icnt,icntb,icntab,hhhh
      real ade(2),H(3),G(2),W(6),F,x(18),ablu,bblu,aAD,bAD
     1     ,B(18),D,Anc,Bnc,BB(18),aKS,bKS,aCVM,bCVM,a2AD,b2AD
     1     ,anda(500),esta(500),andb(500),estb(500),andab(500),
     1     estaa(500),estbb(500),AD3
      double precision dseed

c  *** Enter the ZXMIN required constants

      NPAR = 2
      NSIG = 3
      MAXFN = 500
      IOPT = 0

c  *** Initialize the ade value to the blu estimate

      ade(1) = ablu
      ade(2) = bblu

c  *** Call ZXMIN to refine the ade values by minimizing
c  *** the Anderson-Darling distance (AD) computed in
c  *** the subroutine AD2DIS

      call ZXMIN(AD2DIS,NPAR,NSIG,MAXFN,IOPT,ade,H,G,F,W,IER)
      a2AD = ade(1)
      b2AD = ade(2)

c  *** Reinitialize the icntab, andab, estbb, and estaa arrays

   25 icntab = 0
      do 30 i = 1,W(2)
         andab(i) = 0
         estaa(i) = 0
         estbb(i) = 0
   30 continue

c  *** Increment AD counter for valid AD estimates

      count(c,nn,9) = count(c,nn,9) + 1
      count(c,nn,10) = count(c,nn,10) + 1
      return
      end

      Subroutine AD2DIS(NPAR,ade,F)
c---------------------------------------------------------------------
c  Purpose:  AD2DIS provides the function which is to be minimized
c            by ZXMIN for the Anderson-Darling distance measure.
c            The location and scale parameters are both altered to
c            achieve the minimization.
c---------------------------------------------------------------------
c  Variables:  NPAR   = number of parameters available to alter
c              n      = sample size
c              ade    = estimates of the parameter being altered
c              F      = value of the function to be minimized
c              x      = array of ordered Pareto variates
c              c      = shape parameter
c              zi     = array of Pareto cdf points
c              AAA    = array of terms to be summed in AD formula
c              SAAD   = sum of the AAA(i) terms
c              AD     = Anderson-Darling distance measure
c              andab  = array of calculated A-D distance measures
c              estaa  = array of location estimates used to minimize
c                       the A-D distance measure
c              estbb  = array of scale estimates used to minimize
c                       the A-D distance measure
c              icntab = counter for the number of estimates used
c---------------------------------------------------------------------
c  Inputs:     NPAR   = number of parameters available to alter
c              n      = sample size
c              ade    = initial estimate (the blu estimate)
c              x      = array of ordered Pareto variates
c              c      = shape parameter
c---------------------------------------------------------------------
c  Outputs:    F      = value of the function at the final estimate
c              ade    = revised estimates of location and scale;
c                       these are the Anderson-Darling minimum
c                       distance estimates
c---------------------------------------------------------------------
c  Calculations:
c              z(i)   = 1 - (1 + [x(i)-a]/b)**(-c)
c              AAA(i) = (2i-1) [ ln z(i) + ln ( 1-z(n+1-i) ) ]
c              SAAD   = AAA(1) + AAA(2) + ... + AAA(n)
c              AD     = (-SAAD)/n - n
c---------------------------------------------------------------------

c  •**  Variable  Declarations: 

common  n, x , c , ablu , bblu , dseed, B, 0, Rnc , Bnc, BB, aKS, bKS, 

1  aCVM, bCVM, aflD.bAD, nn, count , aZRD, bZRO, anda, esta, lent 

1  ,andb,estb, icntb,andab,estaa,estbb, ientab 

integer  NPflR, n, c, nn, count! 4,5, 14) , icnt , icntb, i entab 
real  ade(NPflR),F,x( 18),2i( 18),SftAD,ftD 
1  , ablu, bblu, B( 18) ,D,Anc,Bnc,BB(  18) ,aKS,bKS, aCVM,bCVM, 

1  aRO, bAD, aZA0,bZA0, anda( 580) , esta(  500) , andb(  500) , 

1  estb( 500) , andab<  500) , estaa( 500) , estbb(  500 ) 

double  precision  dseed 
do  10  j=l,n 

c  ***  Calculate  the  Pareto  cdf  point  values 

2i(j)  =  l-<l/(l+<x<j )-ade(i))/ade(Z)))»*c 
c  Test  2i(j)  and  [  1  -  2i(j)  1  for  negativity 

if  (2i(j).le.0  .or.  2i(j).ge.l)  then 
go  to  30 
end  if 

10  continue 

SARD  =  0 

c  Calculate  the  Anderson-Darling  distance 

do  Z0  m=l,n 

AAA(m)  =  (Z*m-1)  •  <log(2i(m))  +  log! l-2i( n+l-m) ) ) 
SARD  =  SARD  +  AAA(n) 

Z0  continue 

AD  =  (-1)  *  (n  +  SAAO/n) 
c  »**  Save  the  AD  and  ade( 1 )  values 

icntab  =  icntab  +  1 

andab! icntab)  =  AD 

estaa( icntab)  =  ade< 1 ) 
estbb( icntab)  =  ade( Z ) 
c  ***  Relabel  the  A-D  distance 

F  =  AD 


go  to  40 

30 

ade( 1 )  = 

estaa( icntab-1 ) 

ade( Z )  = 

estbb( icntab-1 ) 

40 

return 

end 

=  1  -  (1  +  tx<i)-a]/b)**(-c) 

=  (Zi-1)  t  In  2(i)  +  In  (  l-2(n+l-i)  )  1 
=  AAA(l)  +  AAA(Z)  +  ...  +  AAA(n) 

=  <-SAAD)/n  -  n 
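The Calculations block of ADZDIS can also be restated outside the COMMON-block framework of this listing. The following is a minimal free-form Fortran sketch under stated assumptions, not the thesis code: the program name, the function name `ad_distance`, the sample values, and the parameter values (a = 0, b = 1, c = 2) are hypothetical; the shape parameter is taken as real rather than integer; and an infeasible cdf value returns a large penalty instead of restoring the previous estimates as ADZDIS does.

```fortran
program ad_sketch
  implicit none
  ! Hypothetical ordered sample, for illustration only
  real :: x(6) = [0.10, 0.25, 0.45, 0.80, 1.50, 3.20]

  print *, 'A-D distance =', ad_distance(x, 0.0, 1.0, 2.0)

contains

  ! Anderson-Darling distance between the EDF of the ordered sample x
  ! and the Pareto cdf  F(x) = 1 - (1 + (x-a)/b)**(-c)
  real function ad_distance(x, a, b, c) result(ad)
    real, intent(in) :: x(:), a, b, c
    real :: z(size(x)), s
    integer :: i, n

    n = size(x)
    z = 1.0 - (1.0 + (x - a)/b)**(-c)

    ! Both log arguments must lie strictly inside (0,1).
    ! Assumption: signal an infeasible point with a large penalty value.
    if (any(z <= 0.0) .or. any(z >= 1.0)) then
      ad = huge(1.0)
      return
    end if

    ! SAAD = sum of (2i-1) [ ln z(i) + ln(1 - z(n+1-i)) ]
    s = 0.0
    do i = 1, n
      s = s + real(2*i - 1) * (log(z(i)) + log(1.0 - z(n + 1 - i)))
    end do

    ! AD = (-SAAD)/n - n
    ad = -s/real(n) - real(n)
  end function ad_distance

end program ad_sketch
```

Passing the sample and parameters as arguments keeps the function free of shared state, which makes it easier to check against hand calculations than the COMMON-based version.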
      Subroutine CVAMD
c---------------------------------------------------------------------
c  Purpose:  CVAMD generates the minimum distance estimates of
c            the location parameter based upon minimizing the
c            Cramer-von Mises distance measure defined in
c            subroutine CVADIS.  CVAMD uses the blu estimates as the
c            starting points for the estimate modifications.
c---------------------------------------------------------------------
c  Variables:  NPAR    = number of parameters altered by minimizing
c                        the Cramer-von Mises (CVM) distance
c              NSIG    = number of significant digits for convergence
c              MAXFN   = maximum number of function evaluations
c              IOPT    = options selector (see IMSL manual on ZXMIN)
c              H, G, W = vectors defined in IMSL manual on ZXMIN
c              IER     = error parameter (see IMSL manual on ZXMIN)
c              F       = value of CVM distance at the final
c                        parameter estimates
c              cvme    = CVM derived minimum distance estimate
c              alCV    = CVM minimum distance location estimate
c              ablu    = blu estimate of location
c              bblu    = blu estimate of scale
c              count   = array of counters used to count the number of
c                        valid estimate values found
c---------------------------------------------------------------------
c  Inputs:   NPAR  = number of parameters altered while minimizing
c            NSIG  = number of significant digits required
c            MAXFN = maximum number of function evaluations
c            IOPT  = options selector (see IMSL manual on ZXMIN)
c            cvme  = initial estimate for the minimization process
c            ablu  = blu estimate of location
c            bblu  = blu estimate of scale
c---------------------------------------------------------------------
c  Outputs:  F       = minimum value of the function being minimized
c            cvme    = revised estimate values
c            alCV    = revised MD estimate of location [alCV=cvme(1)]
c            H, G, W = vectors defined in IMSL manual on ZXMIN
c            IER     = error parameter (see IMSL manual on ZXMIN)
c---------------------------------------------------------------------
c  Calculate:  no calculations performed in this subroutine
c---------------------------------------------------------------------
c  ***  Variables Declaration:
      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1  aCVM,bCVM,aAD,bAD,nn,count,aZAD,bZAD,anda,esta,icnt
     1  ,andb,estb,icntb,andab,estaa,estbb,icntab
     1  ,alCV,blCV,alKS,blKS
      external CVADIS
      integer NPAR,NSIG,MAXFN,IOPT,n,c,nn,count(4,5,14),icnt,hh
     1  ,icntb,icntab
      real cvme(1),H(1),G(1),W(3),F,x(18),ablu,bblu,aAD,bAD
     1  ,B(18),D,Anc,Bnc,BB(18),aKS,bKS,aCVM,bCVM,aZAD,bZAD
     1  ,anda(500),esta(500),AD1,andb(500),estb(500),andab(500)
     1  ,estaa(500),estbb(500),alCV,blCV,alKS,blKS
      double precision dseed
c  ***  Enter the ZXMIN required constants
      NPAR = 1
      NSIG = 3
      MAXFN = 500
      IOPT = 0
c  ***  Initialize the cvme value to the blu estimate
      cvme(1) = ablu
c  ***  Call ZXMIN to refine the cvme values by minimizing
c  ***  the Cramer-von Mises distance (W2) computed in
c  ***  the subroutine CVADIS
      call ZXMIN(CVADIS,NPAR,NSIG,MAXFN,IOPT,cvme,H,G,F,W,IER)
c  ***  Relabel the refined estimate of location
      alCV = cvme(1)
c  ***  Increment the alCV counter
      count(c,nn,11) = count(c,nn,11) + 1
      return
      end

      Subroutine CVADIS(NPAR,cvme,F)
c---------------------------------------------------------------------
c  Purpose:  CVADIS provides the function which is to be minimized
c            by ZXMIN for the Cramer-von Mises distance measure.
c            The location parameter is altered to
c            achieve this minimization.
c---------------------------------------------------------------------
c  Variables:  NPAR = number of parameters available to alter
c              n    = sample size
c              cvme = estimates of the parameters being altered
c              F    = value of the function to be minimized
c              x    = array of ordered Pareto variates
c              c    = shape parameter
c              zi   = array of Pareto cdf points
c              ACV  = the squared quantity in the W2 formula
c              SCVM = the sum of the ACV quantities
c              W2   = the CVM distance measure
c---------------------------------------------------------------------
c  Inputs:   NPAR = number of parameters available to alter
c            n    = sample size
c            cvme = initial estimates (the blu estimates)
c            x    = array of ordered Pareto variates
c            c    = shape parameter
c---------------------------------------------------------------------
c  Outputs:  F    = value of the function at the final estimates
c            cvme = revised estimate of location;
c                   this is the CVM minimum distance estimate
c---------------------------------------------------------------------
c  Calculations:
c     z(i)   = 1 - (1 + [x(i)-a]/b)**(-c)
c     ACV(i) = [ z(i) - (2i-1)/2n ]**2
c     SCVM   = ACV(1) + ACV(2) + ... + ACV(n)
c     W2     = SCVM + 1/12n
c---------------------------------------------------------------------
c  ***  Variable Declarations:
      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1  aCVM,bCVM,aAD,bAD,nn,count,aZAD,bZAD,anda,esta,icnt
     1  ,andb,estb,icntb,andab,estaa,estbb,icntab
     1  ,alCV,blCV,alKS,blKS
      integer NPAR,NSIG,MAXFN,IOPT,n,c,nn,count(4,5,14),icnt,hh
     1  ,icntb,icntab
      real cvme(1),H(1),G(1),W(3),F,x(18),ablu,bblu,aAD,bAD
     1  ,B(18),D,Anc,Bnc,BB(18),aKS,bKS,aCVM,bCVM,aZAD,bZAD
     1  ,anda(500),esta(500),AD1,andb(500),estb(500),andab(500)
     1  ,estaa(500),estbb(500),alCV,blCV,alKS,blKS
     1  ,zi(18),SCVM,ACV(18),W2
      double precision dseed
      SCVM = 0
      do 10 j=1,n
         zi(j) = 1-(1/(1+(x(j)-cvme(1))/bblu))**c
         ACV(j) = (zi(j) - (2*j-1)/(2*real(n)))**2
         SCVM = SCVM + ACV(j)
   10 continue
      W2 = SCVM + (1/(12*real(n)))
      F = W2
      return
      end
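CVAMD and CVADIS together perform a one-parameter minimization of the Cramer-von Mises distance over the location parameter. Because ZXMIN is a proprietary IMSL routine, the sketch below substitutes a plain golden-section search for it; this is a stand-in technique, not the quasi-Newton method ZXMIN uses. Everything here is an assumption made for illustration: the names `w2` and `golden_min`, the sample values, the fixed scale (b = 1) and shape (c = 2), the search bracket, and the premise that the distance is unimodal on that bracket.

```fortran
program cvm_sketch
  implicit none
  ! Hypothetical ordered sample, for illustration only
  real :: x(6) = [0.10, 0.25, 0.45, 0.80, 1.50, 3.20]
  real :: ahat

  ! Assumed bracket; the upper end stays below x(1) so that the
  ! base 1 + (x(i)-a)/b remains positive for every sample point.
  ahat = golden_min(-1.0, 0.09)
  print *, 'location estimate =', ahat, '  W2 =', w2(ahat)

contains

  ! Cramer-von Mises distance as a function of the location a,
  ! with scale b = 1 and shape c = 2 held fixed (assumed values)
  real function w2(a)
    real, intent(in) :: a
    integer :: i, n
    real :: z
    n = size(x)
    w2 = 1.0/(12.0*real(n))
    do i = 1, n
      z = 1.0 - (1.0 + (x(i) - a)/1.0)**(-2.0)
      w2 = w2 + (z - real(2*i - 1)/(2.0*real(n)))**2
    end do
  end function w2

  ! Golden-section search for a minimum of w2 on [lo, hi]
  real function golden_min(lo, hi) result(xmin)
    real, intent(in) :: lo, hi
    real, parameter :: r = 0.6180340
    real :: a, b, x1, x2, f1, f2
    integer :: k
    a = lo
    b = hi
    x1 = b - r*(b - a)
    x2 = a + r*(b - a)
    f1 = w2(x1)
    f2 = w2(x2)
    do k = 1, 60
      if (f1 < f2) then
        ! Minimum lies in [a, x2]: shrink from the right
        b = x2
        x2 = x1
        f2 = f1
        x1 = b - r*(b - a)
        f1 = w2(x1)
      else
        ! Minimum lies in [x1, b]: shrink from the left
        a = x1
        x1 = x2
        f1 = f2
        x2 = a + r*(b - a)
        f2 = w2(x2)
      end if
    end do
    xmin = 0.5*(a + b)
  end function golden_min

end program cvm_sketch
```

A derivative-free bracket search is used only because it is short and easy to verify; it needs a bracket in place of the single starting point (the blu estimate) that the listing supplies to ZXMIN.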

      Subroutine CVBMD
c---------------------------------------------------------------------
c  Purpose:  CVBMD generates the minimum distance estimates of
c            the scale parameter based upon minimizing the
c            Cramer-von Mises distance measure defined in
c            subroutine CVBDIS.  CVBMD uses the blu estimates as the
c            starting points for the estimate modifications.
c---------------------------------------------------------------------
c  Variables:  NPAR    = number of parameters altered by minimizing
c                        the Cramer-von Mises (CVM) distance
c              NSIG    = number of significant digits for convergence
c              MAXFN   = maximum number of function evaluations
c              IOPT    = options selector (see IMSL manual on ZXMIN)
c              H, G, W = vectors defined in IMSL manual on ZXMIN
c              IER     = error parameter (see IMSL manual on ZXMIN)
c              F       = value of CVM distance at the final
c                        parameter estimates
c              cvme    = CVM derived minimum distance estimate
c              blCV    = CVM minimum distance scale estimate
c              ablu    = blu estimate of location
c              bblu    = blu estimate of scale
c              count   = array of counters used to count the number of
c                        valid estimate values found
c---------------------------------------------------------------------
c  Inputs:   NPAR  = number of parameters altered while minimizing
c            NSIG  = number of significant digits required
c            MAXFN = maximum number of function evaluations
c            IOPT  = options selector (see IMSL manual on ZXMIN)
c            cvme  = initial estimate for the minimization process
c            ablu  = blu estimate of location
c            bblu  = blu estimate of scale
c---------------------------------------------------------------------
c  Outputs:  F       = minimum value of the function being minimized
c            cvme    = revised estimate values
c            blCV    = revised MD estimate of scale [blCV=cvme(1)]
c            H, G, W = vectors defined in IMSL manual on ZXMIN
c            IER     = error parameter (see IMSL manual on ZXMIN)
c---------------------------------------------------------------------
c  Calculate:  no calculations performed in this subroutine
c---------------------------------------------------------------------
c  ***  Variables Declaration:
      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1  aCVM,bCVM,aAD,bAD,nn,count,aZAD,bZAD,anda,esta,icnt
     1  ,andb,estb,icntb,andab,estaa,estbb,icntab
     1  ,alCV,blCV,alKS,blKS
      external CVBDIS
      integer NPAR,NSIG,MAXFN,IOPT,n,c,nn,count(4,5,14),icnt,hh
     1  ,icntb,icntab
      real cvme(1),H(1),G(1),W(3),F,x(18),ablu,bblu,aAD,bAD
     1  ,B(18),D,Anc,Bnc,BB(18),aKS,bKS,aCVM,bCVM,aZAD,bZAD
     1  ,anda(500),esta(500),AD1,andb(500),estb(500),andab(500)
     1  ,estaa(500),estbb(500),alCV,blCV,alKS,blKS
      double precision dseed
c  ***  Enter the ZXMIN required constants
      NPAR = 1
      NSIG = 3
      MAXFN = 500
      IOPT = 0
c  ***  Initialize the cvme value to the blu estimate
      cvme(1) = bblu
c  ***  Call ZXMIN to refine the cvme values by minimizing
c  ***  the Cramer-von Mises distance (W2) computed in
c  ***  the subroutine CVBDIS
      call ZXMIN(CVBDIS,NPAR,NSIG,MAXFN,IOPT,cvme,H,G,F,W,IER)
c  ***  Relabel the refined estimate of scale
      blCV = cvme(1)
c  ***  Increment the blCV counter
      count(c,nn,12) = count(c,nn,12) + 1
      return
      end

      Subroutine CVBDIS(NPAR,cvme,F)
c---------------------------------------------------------------------
c  Purpose:  CVBDIS provides the function which is to be minimized
c            by ZXMIN for the Cramer-von Mises distance measure.
c            The scale parameter is altered to
c            achieve this minimization.
c---------------------------------------------------------------------
c  Variables:  NPAR = number of parameters available to alter
c              n    = sample size
c              cvme = estimates of the parameters being altered
c              F    = value of the function to be minimized
c              x    = array of ordered Pareto variates
c              c    = shape parameter
c              zi   = array of Pareto cdf points
c              ACV  = the squared quantity in the W2 formula
c              SCVM = the sum of the ACV quantities
c              W2   = the CVM distance measure
c---------------------------------------------------------------------
c  Inputs:   NPAR = number of parameters available to alter
c            n    = sample size
c            cvme = initial estimates (the blu estimates)
c            x    = array of ordered Pareto variates
c            c    = shape parameter
c---------------------------------------------------------------------
c  Outputs:  F    = value of the function at the final estimates
c            cvme = revised estimates of scale;
c                   these are the CVM minimum distance estimates
c---------------------------------------------------------------------
c  Calculations:
c     z(i)   = 1 - (1 + [x(i)-a]/b)**(-c)
c     ACV(i) = [ z(i) - (2i-1)/2n ]**2
c     SCVM   = ACV(1) + ACV(2) + ... + ACV(n)
c     W2     = SCVM + 1/12n
c---------------------------------------------------------------------
c  ***  Variable Declarations:
      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1  aCVM,bCVM,aAD,bAD,nn,count,aZAD,bZAD,anda,esta,icnt
     1  ,andb,estb,icntb,andab,estaa,estbb,icntab
     1  ,alCV,blCV,alKS,blKS
      integer NPAR,NSIG,MAXFN,IOPT,n,c,nn,count(4,5,14),icnt,hh
     1  ,icntb,icntab
      real cvme(1),H(1),G(1),W(3),F,x(18),ablu,bblu,aAD,bAD
     1  ,B(18),D,Anc,Bnc,BB(18),aKS,bKS,aCVM,bCVM,aZAD,bZAD
     1  ,anda(500),esta(500),AD1,andb(500),estb(500),andab(500)
     1  ,estaa(500),estbb(500),alCV,blCV,alKS,blKS
     1  ,zi(18),SCVM,ACV(18),W2
      double precision dseed
      SCVM = 0
      do 10 j=1,n
         zi(j) = 1-(1/(1+(x(j)-ablu)/cvme(1)))**c
         ACV(j) = (zi(j) - (2*j-1)/(2*real(n)))**2
         SCVM = SCVM + ACV(j)
   10 continue
      W2 = SCVM + (1/(12*real(n)))
      F = W2
      return
      end

      Subroutine KSAMD
c---------------------------------------------------------------------
c  Purpose:  KSAMD generates the minimum distance estimates of
c            location based upon minimizing the
c            Kolmogorov distance measure defined in subroutine
c            KADIS.  This routine uses the blu estimates as the
c            starting points for the estimate modifications.
c---------------------------------------------------------------------
c  Variables:  NPAR    = number of parameters altered by minimizing
c                        the Kolmogorov distance
c              NSIG    = number of significant digits for convergence
c              MAXFN   = maximum number of function evaluations
c              IOPT    = options selector (see IMSL manual on ZXMIN)
c              H, G, W = vectors defined in IMSL manual on ZXMIN
c              IER     = error parameter (see IMSL manual on ZXMIN)
c              F       = value of Kolmogorov distance at the final
c                        parameter estimates
c              kse     = Kolmogorov derived minimum distance estimates
c              alKS    = Kolmogorov minimum distance location estimate
c              ablu    = blu estimate of location
c              bblu    = blu estimate of scale
c              count   = array of counters used to count the number of
c                        valid estimate values found
c---------------------------------------------------------------------
c  Inputs:   NPAR  = number of parameters altered while minimizing
c            NSIG  = number of significant digits required
c            MAXFN = maximum number of function evaluations
c            IOPT  = options selector (see IMSL manual on ZXMIN)
c            kse   = initial estimates for the minimization process
c            ablu  = blu estimate of location
c            bblu  = blu estimate of scale
c---------------------------------------------------------------------
c  Outputs:  F       = minimum value of the function being minimized
c            kse     = revised estimate values
c            alKS    = revised MD estimate of location [alKS = kse(1)]
c            H, G, W = vectors defined in IMSL manual on ZXMIN
c            IER     = error parameter (see IMSL manual on ZXMIN)
c---------------------------------------------------------------------
c  Calculate:  no calculations performed in this subroutine
c---------------------------------------------------------------------
c  ***  Variables Declaration:
      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1  aCVM,bCVM,aAD,bAD,nn,count,aZAD,bZAD,anda,esta,icnt
     1  ,andb,estb,icntb,andab,estaa,estbb,icntab
     1  ,alCV,blCV,alKS,blKS
      external KADIS
      integer NPAR,NSIG,MAXFN,IOPT,n,c,nn,count(4,5,14),icnt,hh
     1  ,icntb,icntab
      real kse(1),H(1),G(1),W(3),F,x(18),ablu,bblu,aAD,bAD
     1  ,B(18),D,Anc,Bnc,BB(18),aKS,bKS,aCVM,bCVM,aZAD,bZAD
     1  ,anda(500),esta(500),AD1,andb(500),estb(500),andab(500)
     1  ,estaa(500),estbb(500),alCV,blCV,alKS,blKS
     1  ,zi(18),SCVM,ACV(18),W2
      double precision dseed
c  ***  Enter the ZXMIN required constants
      NPAR = 1
      NSIG = 3
      MAXFN = 500
      IOPT = 0
c  ***  Initialize the kse values to the blu estimates
      kse(1) = ablu
c  ***  Call ZXMIN to refine the kse values by minimizing
c  ***  the Kolmogorov distance (KST) computed in the
c  ***  subroutine KADIS
      call ZXMIN(KADIS,NPAR,NSIG,MAXFN,IOPT,kse,H,G,F,W,IER)
c  ***  Relabel the refined estimate of location
      alKS = kse(1)
c  ***  Increment the alKS counter
      count(c,nn,13) = count(c,nn,13) + 1
      return
      end

      Subroutine KADIS(NPAR,kse,F)
c---------------------------------------------------------------------
c  Purpose:  KADIS provides the function which is to be minimized
c            by ZXMIN for the Kolmogorov distance measure.  The
c            location parameter is altered to achieve
c            this minimization.
c---------------------------------------------------------------------
c  Variables:  NPAR   = number of parameters available to alter
c              n      = sample size
c              kse    = estimates of the parameters being altered
c              F      = value of the function to be minimized
c              x      = array of ordered Pareto variates
c              c      = shape parameter
c              zi     = array of Pareto cdf points
c              DP     = positive differences between the EDF and cdf
c              DM     = negative differences between the EDF and cdf
c              DPLUS  = maximum positive difference
c              DMINUS = maximum negative difference
c              KST    = maximum of DPLUS and DMINUS
c---------------------------------------------------------------------
c  Inputs:   NPAR = number of parameters available to alter
c            n    = sample size
c            kse  = initial estimates (the blu estimates)
c            x    = array of ordered Pareto variates
c            c    = shape parameter

      Subroutine KSBMD
c---------------------------------------------------------------------
c  Purpose:  KSBMD generates the minimum distance estimates of
c            scale based upon minimizing the
c            Kolmogorov distance measure defined in subroutine
c            KBDIS.  This routine uses the blu estimates as the
c            starting points for the estimate modifications.
c---------------------------------------------------------------------
c  Variables:  NPAR    = number of parameters altered by minimizing
c                        the Kolmogorov distance
c              NSIG    = number of significant digits for convergence
c              MAXFN   = maximum number of function evaluations
c              IOPT    = options selector (see IMSL manual on ZXMIN)
c              H, G, W = vectors defined in IMSL manual on ZXMIN
c              IER     = error parameter (see IMSL manual on ZXMIN)
c              F       = value of Kolmogorov distance at the final
c                        parameter estimates
c              kse     = Kolmogorov derived minimum distance estimates
c              blKS    = Kolmogorov minimum distance scale estimate
c              ablu    = blu estimate of location
c              bblu    = blu estimate of scale
c              count   = array of counters used to count the number of
c                        valid estimate values found
c---------------------------------------------------------------------
c  Inputs:   NPAR  = number of parameters altered while minimizing
c            NSIG  = number of significant digits required
c            MAXFN = maximum number of function evaluations
c            IOPT  = options selector (see IMSL manual on ZXMIN)
c            kse   = initial estimates for the minimization process
c            ablu  = blu estimate of location
c            bblu  = blu estimate of scale
c---------------------------------------------------------------------
c  Outputs:  F       = minimum value of the function being minimized
c            kse     = revised estimate values
c            blKS    = revised MD estimate of scale [blKS = kse(1)]
c            H, G, W = vectors defined in IMSL manual on ZXMIN
c            IER     = error parameter (see IMSL manual on ZXMIN)
c---------------------------------------------------------------------
c  Calculate:  no calculations performed in this subroutine
c---------------------------------------------------------------------
c  ***  Variables Declaration:
      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1  aCVM,bCVM,aAD,bAD,nn,count,aZAD,bZAD,anda,esta,icnt
     1  ,andb,estb,icntb,andab,estaa,estbb,icntab
     1  ,alCV,blCV,alKS,blKS
      external KBDIS
      integer NPAR,NSIG,MAXFN,IOPT,n,c,nn,count(4,5,14),icnt,hh
     1  ,icntb,icntab
      real kse(1),H(1),G(1),W(3),F,x(18),ablu,bblu,aAD,bAD
     1  ,B(18),D,Anc,Bnc,BB(18),aKS,bKS,aCVM,bCVM,aZAD,bZAD
     1  ,anda(500),esta(500),AD1,andb(500),estb(500),andab(500)
     1  ,estaa(500),estbb(500),alCV,blCV,alKS,blKS
     1  ,zi(18),SCVM,ACV(18),W2
      double precision dseed
c  ***  Enter the ZXMIN required constants
      NPAR = 1
      NSIG = 3
      MAXFN = 500
      IOPT = 0
c  ***  Initialize the kse values to the blu estimates
      kse(1) = bblu
c  ***  Call ZXMIN to refine the kse values by minimizing
c  ***  the Kolmogorov distance (KST) computed in the
c  ***  subroutine KBDIS
      call ZXMIN(KBDIS,NPAR,NSIG,MAXFN,IOPT,kse,H,G,F,W,IER)
c  ***  Relabel the refined estimate of scale
      blKS = kse(1)
c  ***  Increment the blKS counter
      count(c,nn,14) = count(c,nn,14) + 1
      return
      end

      Subroutine KBDIS(NPAR,kse,F)
c---------------------------------------------------------------------
c  Purpose:  KBDIS provides the function which is to be minimized
c            by ZXMIN for the Kolmogorov distance measure.  The
c            scale parameter is altered to achieve
c            this minimization.
c---------------------------------------------------------------------
c  Variables:  NPAR   = number of parameters available to alter
c              n      = sample size
c              kse    = estimates of the parameters being altered
c              F      = value of the function to be minimized
c              x      = array of ordered Pareto variates
c              c      = shape parameter
c              zi     = array of Pareto cdf points
c              DP     = positive differences between the EDF and cdf
c              DM     = negative differences between the EDF and cdf
c              DPLUS  = maximum positive difference
c              DMINUS = maximum negative difference
c              KST    = maximum of DPLUS and DMINUS
c---------------------------------------------------------------------
c  Inputs:   NPAR = number of parameters available to alter
c            n    = sample size
c            kse  = initial estimates (the blu estimates)
c            x    = array of ordered Pareto variates
c            c    = shape parameter
c---------------------------------------------------------------------
c  Outputs:  F   = value of the function at the final estimates
c            kse = revised estimates of scale;
c                  these are the Kolmogorov minimum distance
c                  estimates
c---------------------------------------------------------------------
c  Calculations:
c     z(i)  = 1 - (1 + [x(i)-a]/b)**(-c)
c     DP(i) = ABS[ i/n - z(i) ]
c     DM(i) = ABS[ z(i) - (i-1)/n ]
c---------------------------------------------------------------------
c  ***  Variable Declarations:
      common n,x,c,ablu,bblu,dseed,B,D,Anc,Bnc,BB,aKS,bKS,
     1  aCVM,bCVM,aAD,bAD,nn,count,aZAD,bZAD,anda,esta,icnt
     1  ,andb,estb,icntb,andab,estaa,estbb,icntab
     1  ,alCV,blCV,alKS,blKS
      integer NPAR,NSIG,MAXFN,IOPT,n,c,nn,count(4,5,14),icnt,hh
     1  ,icntb,icntab
      real kse(1),H(1),G(1),W(3),F,x(18),ablu,bblu,aAD,bAD
     1  ,B(18),D,Anc,Bnc,BB(18),aKS,bKS,aCVM,bCVM,aZAD,bZAD
     1  ,anda(500),esta(500),AD1,andb(500),estb(500),andab(500)
     1  ,estaa(500),estbb(500),alCV,blCV,alKS,blKS
     1  ,zi(18),DP(18),DM(18),DPLUS,DMINUS,KST
      double precision dseed
c  ***  Calculate the Pareto cdf value [zi(j)] at each point
c  ***  and the differences between the EDF step function
c  ***  and the cdf points
      do 10 j=1,n
         zi(j) = 1-(1/(1+(x(j)-ablu)/kse(1)))**c
         DP(j) = ABS(j/real(n) - zi(j))
         DM(j) = ABS(zi(j) - (j-1)/real(n))
   10 continue
c  ***  Select the maximum of the plus and minus differences
      DPLUS = MAX(DP(1),DP(2),DP(3),DP(4),DP(5),DP(6),DP(7)
     1  ,DP(8),DP(9),DP(10),DP(11),DP(12),DP(13),DP(14)
     1  ,DP(15),DP(16),DP(17),DP(18))
      DMINUS = MAX(DM(1),DM(2),DM(3),DM(4),DM(5),DM(6),DM(7)
     1  ,DM(8),DM(9),DM(10),DM(11),DM(12),DM(13),DM(14)
     1  ,DM(15),DM(16),DM(17),DM(18))
c  ***  Select the maximum Kolmogorov distance measure and
c  ***  set F equal to that distance.  F becomes the
c  ***  function which ZXMIN attempts to minimize by
c  ***  altering the value of the scale parameter
      KST = MAX(DPLUS,DMINUS)
      F = KST
      return
      end
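The Kolmogorov distance computed by KBDIS can be written more compactly when the maximum is taken over the n sample points only. The sketch below is an illustration under stated assumptions rather than the thesis code: the names `ks_sketch` and `ks_distance`, the sample values, and the parameter values (a = 0, b = 1, c = 2) are hypothetical, and the shape parameter is taken as real. Unlike the listing, which takes MAX over all 18 array slots, it loops only over the first n entries, so unused slots cannot affect the result.

```fortran
program ks_sketch
  implicit none
  ! Hypothetical ordered sample, for illustration only
  real :: x(6) = [0.10, 0.25, 0.45, 0.80, 1.50, 3.20]

  print *, 'Kolmogorov distance =', ks_distance(x, 0.0, 1.0, 2.0)

contains

  ! Kolmogorov distance between the EDF of the ordered sample x
  ! and the Pareto cdf  F(x) = 1 - (1 + (x-a)/b)**(-c)
  real function ks_distance(x, a, b, c) result(kst)
    real, intent(in) :: x(:), a, b, c
    real :: z, dplus, dminus
    integer :: i, n

    n = size(x)
    dplus = 0.0
    dminus = 0.0
    do i = 1, n
      z = 1.0 - (1.0 + (x(i) - a)/b)**(-c)
      ! DP(i) = ABS[ i/n - z(i) ],  DM(i) = ABS[ z(i) - (i-1)/n ]
      dplus  = max(dplus,  abs(real(i)/real(n) - z))
      dminus = max(dminus, abs(z - real(i - 1)/real(n)))
    end do
    kst = max(dplus, dminus)
  end function ks_distance

end program ks_sketch
```

Tracking the running maxima inside the loop removes the need for the DP and DM work arrays and for the fixed upper bound of 18 on the sample size.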